Data validation

In computing, data validation or input validation is the process of ensuring that data have undergone data cleansing to confirm that they have data quality, that is, that they are both correct and useful. It uses routines, often called "validation rules", "validation constraints", or "check routines", that check the correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by including explicit validation logic in the application program.

This is distinct from formal verification, which attempts to prove or disprove the correctness of algorithms for implementing a specification or property.

Overview

Data validation is intended to provide certain well-defined guarantees for fitness and consistency of data in an application or automated system. Data validation rules can be defined and designed using various methodologies, and be deployed in various contexts.[1] Their implementation can use declarative data integrity rules, or procedure-based business rules.[2]

The guarantees of data validation do not necessarily include accuracy, and it is possible for data entry errors such as misspellings to be accepted as valid. Other clerical and/or computer controls may be applied to reduce inaccuracy within a system.

Different kinds

In evaluating the basics of data validation, generalizations can be made regarding the different kinds of validation according to their scope, complexity, and purpose.

For example:

  • Data type validation;
  • Range and constraint validation;
  • Code and cross-reference validation;
  • Structured validation; and
  • Consistency validation

Data-type check

Data type validation is customarily carried out on one or more simple data fields.

The simplest kind of data type validation verifies that the individual characters provided through user input are consistent with the expected characters of one or more known primitive data types as defined in a programming language or data storage and retrieval mechanism.

For example, an integer field may require input to use only characters 0 through 9.
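The integer-field rule above can be sketched as a character-level check. This is a minimal illustration only; the function name is hypothetical, and the sketch assumes that an empty string should be rejected.

```python
def is_integer_field(text: str) -> bool:
    """Return True if text is non-empty and uses only the characters 0 through 9."""
    # str.isdigit() also accepts digits from other scripts, so the allowed
    # characters are listed explicitly instead.
    return len(text) > 0 and all(ch in "0123456789" for ch in text)
```

A check like this only confirms the characters; whether a sign or a thousands separator should be allowed is a separate design decision.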

Simple range and constraint check

Simple range and constraint validation may examine input for consistency with a minimum/maximum range, or consistency with a test for evaluating a sequence of characters, such as one or more tests against regular expressions. For example, a counter value may be required to be a non-negative integer, and a password may be required to meet a minimum length and contain characters from multiple categories.
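A minimal sketch of both examples follows. The minimum length of 8, the four character categories, and the requirement of three categories are assumptions chosen for illustration, not values taken from any standard.

```python
import re

MIN_PASSWORD_LENGTH = 8  # assumed policy value

# Assumed character categories: lowercase, uppercase, digit, anything else.
CATEGORY_PATTERNS = [r"[a-z]", r"[A-Z]", r"[0-9]", r"[^a-zA-Z0-9]"]


def is_valid_counter(value: int) -> bool:
    """Range check: a counter must be a non-negative integer."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def is_valid_password(password: str, min_categories: int = 3) -> bool:
    """Constraint check: minimum length plus characters from several categories."""
    if len(password) < MIN_PASSWORD_LENGTH:
        return False
    matched = sum(1 for pattern in CATEGORY_PATTERNS if re.search(pattern, password))
    return matched >= min_categories
```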

Code and cross-reference check

Code and cross-reference validation includes operations to verify that data is consistent with one or more possibly-external rules, requirements, or collections relevant to a particular organization, context or set of underlying assumptions. These additional validity constraints may involve cross-referencing supplied data with a known look-up table or directory information service such as LDAP.

For example, a user-provided country code might be required to identify a current geopolitical region.
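A cross-reference check of this kind can be sketched as a look-up against a set of allowed codes. The set below is a small illustrative subset; a real system would presumably load the current list from an authoritative table or directory service.

```python
# Illustrative subset only; not a complete or authoritative list.
KNOWN_COUNTRY_CODES = frozenset({"DE", "FR", "JP", "US"})


def is_known_country_code(code: str) -> bool:
    """Return True if the code, ignoring case and surrounding spaces, is in the table."""
    return code.strip().upper() in KNOWN_COUNTRY_CODES
```

A code such as "SU" is well formed but is rejected here because it is absent from the look-up table, which is the point of a cross-reference check as opposed to a format check.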

Structured check

Structured validation allows for the combination of other kinds of validation, along with more complex processing. Such complex processing may include the testing of conditional constraints for an entire complex data object or set of process operations within a system.
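As a sketch of a conditional constraint over a whole object, the example below validates a hypothetical payment record. The Payment class, its fields, and the allowed method names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Payment:
    method: str                        # "card" or "invoice" (hypothetical values)
    amount: float
    card_number: Optional[str] = None


def validate_payment(payment: Payment) -> List[str]:
    """Combine simple checks with a conditional constraint; return error messages."""
    errors = []
    if payment.method not in ("card", "invoice"):
        errors.append("unknown payment method")
    if payment.amount <= 0:
        errors.append("amount must be positive")
    # Conditional constraint: a card number is required only for card payments.
    if payment.method == "card" and not payment.card_number:
        errors.append("card payments require a card number")
    return errors
```

Returning a list of messages rather than a single boolean lets the caller report every problem found in the object at once.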

Consistency check

Consistency validation ensures that related data values do not contradict one another. For example, the delivery date of an order can be prohibited from preceding its shipment date.
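The date rule can be sketched as a comparison of two fields. The function and parameter names are hypothetical, and the sketch assumes that same-day delivery is acceptable.

```python
from datetime import date


def dates_are_consistent(shipment_date: date, delivery_date: date) -> bool:
    """Return True unless the delivery date precedes the shipment date."""
    return delivery_date >= shipment_date
```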

Example

Multiple kinds of data validation are relevant to 10-digit pre-2007 ISBNs (the 2005 edition of ISO 2108 required ISBNs to have 13 digits from 2007 onwards[3]).

  • Size. A pre-2007 ISBN must consist of 10 digits, with optional hyphens or spaces separating its four parts.
  • Format checks. Each of the first 9 digits must be 0 through 9, and the 10th must be either 0 through 9 or an X.
  • Check digit. To detect transcription errors in which digits have been altered or transposed, the last digit of a pre-2007 ISBN must match the result of a mathematical formula incorporating the other 9 digits (ISBN-10 check digits).
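The three checks above can be combined in one routine, sketched below. The function name is hypothetical. The check-digit step uses the standard ISBN-10 rule: the ten digits are weighted 10 down to 1, an X counts as 10, and the weighted sum must be divisible by 11. The sketch assumes that only an uppercase X is accepted and does not check where the separators are placed.

```python
import re


def is_valid_isbn10(text: str) -> bool:
    """Validate a 10-digit ISBN, ignoring optional hyphens and spaces."""
    # Size: drop the optional separators; ten characters must remain.
    compact = text.replace("-", "").replace(" ", "")
    # Format: nine digits followed by a digit or an X.
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        return False
    # Check digit: weighted sum with weights 10, 9, ..., 1 must be divisible by 11.
    total = 0
    for position, ch in enumerate(compact):
        value = 10 if ch == "X" else int(ch)
        total += (10 - position) * value
    return total % 11 == 0
```

Because the weights differ for every position, swapping two adjacent unequal digits changes the sum by a non-multiple of 11, which is how the check digit catches transpositions.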

Validation types

Allowed character checks
Checks to ascertain that only expected characters are present in a field. For example, a numeric field may allow only the digits 0–9, the decimal point and perhaps a minus sign or commas. A text field such as a personal name might disallow characters used for markup. An e-mail address might require at least one @ sign and various other structural details. Regular expressions can be an effective way to implement such checks.
Batch totals
Checks for missing records. Numerical fields may be added together for all records in a batch. The batch total is entered and the computer checks that the total is correct, for example by adding together the 'Total Cost' field of all transactions in the batch.
Cardinality check
Checks that a record has a valid number of related records. For example, if a contact record is classified as "customer" then it must have at least one associated order (cardinality > 0). This type of rule can be complicated by additional conditions. For example, if a contact record in a payroll database is classified as "former employee" then it must not have any associated salary payments after the separation date (cardinality = 0).
Check digits
Used for numerical data. To support error detection, an extra digit, calculated from the other digits, is appended to a number.
Consistency checks
Checks fields to ensure data in these fields correspond, e.g., if expiration date is in the past then status is not "active".
Cross-system consistency checks
Compares data in different systems to ensure it is consistent. Systems may represent the same data differently, in which case comparison requires transformation (e.g., one system may store customer name in a single Name field as 'Doe, John Q', while another uses First_Name 'John' and Last_Name 'Doe' and Middle_Name 'Quality').
Data type checks
Checks input conformance with typed data. For example, an input box accepting numeric data may reject the letter 'O'.
File existence check
Checks that a file with a specified name exists. This check is essential for programs that use file handling.
Format check
Checks that the data is in a specified format (template), e.g., dates have to be in the format YYYY-MM-DD. Regular expressions may be used for this kind of validation.
Presence check
Checks that data is present, e.g., customers may be required to have an email address.
Range check
Checks that the data is within a specified range of values, e.g., a probability must be between 0 and 1.
Referential integrity
Values in two relational database tables can be linked through foreign key and primary key. If values in the foreign key field are not constrained by internal mechanisms, then they should be validated to ensure that the referencing table always refers to a row in the referenced table.
Spelling and grammar check
Looks for spelling and grammatical errors.
Uniqueness check
Checks that each value is unique. This can be applied to a combination of several fields (e.g., Address, First Name, Last Name).
Table look up check
A table look up check compares data to a collection of allowed values.
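Three of the checks listed above (format check, batch totals, and uniqueness check) can be sketched as follows. The record layout, field names, and function names are assumptions made for illustration. The format check goes one step beyond the template and also rejects impossible calendar dates.

```python
import re
from collections import Counter
from datetime import date
from decimal import Decimal


def is_iso_date(text: str) -> bool:
    """Format check: YYYY-MM-DD template, then a real calendar date."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def batch_total_matches(records, declared_total: Decimal) -> bool:
    """Batch total: the sum of 'total_cost' must equal the total entered for the batch."""
    actual = sum((record["total_cost"] for record in records), Decimal("0"))
    return actual == declared_total


def duplicate_keys(records, fields):
    """Uniqueness check: return each combination of field values that occurs more than once."""
    counts = Counter(tuple(record[name] for name in fields) for record in records)
    return [key for key, count in counts.items() if count > 1]
```

Decimal is used for the totals so that currency amounts compare exactly; binary floating-point sums may differ from the declared total by rounding error.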

Post-validation actions

Enforcement Action
Enforcement action typically rejects the data entry request and requires the input actor to make a change that brings the data into compliance. This is most suitable for interactive use, where a real person is sitting at the computer and entering data. It also works well for batch upload, where a file input may be rejected and a set of messages sent back to the input source explaining why the data were rejected.
Another form of enforcement action involves automatically changing the data and saving a conformant version instead of the original version. This is most suitable for cosmetic changes. For example, converting an all-caps entry to a Pascal case entry does not need user input. Automatic enforcement is inappropriate where it leads to loss of business information, for example saving a truncated comment when the comment is longer than expected, since significant data may be lost.
Advisory Action
Advisory actions typically allow data to be entered unchanged but send a message to the source actor indicating the validation issues that were encountered. This is most suitable for non-interactive systems, for systems where the change is not business critical, for cleansing steps of existing data and for verification steps of an entry process.
Verification Action
Verification actions are special cases of advisory actions. In this case, the source actor is asked to verify that the data are what they really want to enter, in the light of a suggestion to the contrary. Here, the check step suggests an alternative (e.g., a check of a mailing address returns a different way of formatting that address or suggests a different address altogether). The user should then be given the option of accepting the recommendation or keeping their version. By design, this is not a strict validation process; it is useful for capturing addresses at a new location or at a location that is not yet supported by the validation databases.
Log of validation
Even in cases where data validation did not find any issues, providing a log of the validations that were conducted and their results is important. Such a log helps identify missing validation checks when data issues later surface, and helps improve the validation.
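The actions described above can be sketched as a small decision routine. The class names, the user_confirmed flag, and the log format are hypothetical; the sketch assumes that an enforcement finding always blocks the entry, an advisory finding never does, and a verification finding blocks it only until the user confirms their own version.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    ENFORCE = "enforce"
    ADVISE = "advise"
    VERIFY = "verify"


@dataclass
class Finding:
    action: Action
    message: str


def entry_accepted(findings: List[Finding], user_confirmed: bool, log: List[str]) -> bool:
    """Decide whether an entry may be saved, and log the outcome in every case."""
    accepted = True
    for finding in findings:
        if finding.action is Action.ENFORCE:
            accepted = False
        elif finding.action is Action.VERIFY and not user_confirmed:
            accepted = False
        # Advisory findings are reported to the source but never block the entry.
    # A log line is written even when no issues were found.
    log.append(f"{len(findings)} finding(s), accepted={accepted}")
    return accepted
```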

Validation and security

Failures or omissions in data validation can lead to data corruption or a security vulnerability.[4] Data validation checks that data are fit for purpose,[5] valid, sensible, reasonable and secure before they are processed.
