What Is CAPTCHA? Meaning, Working, Features, and Threats

CAPTCHA is a challenge-response security test that ensures that users are human beings, not bots.

Chiradeep BasuMallick Technical Writer

March 13, 2023

CAPTCHA is defined as a challenge-response test that acts as an authentication mechanism on websites, search engines, and web applications.
It ensures that users (with or without password-based credentials) are human beings, not automated bots trying to crowd the system and cause a cyberattack.
This article provides an overview of CAPTCHA, its working, features, and possible threats.

What Is CAPTCHA?
How Does CAPTCHA Work?
Features of CAPTCHA
Threats to CAPTCHA

What Is CAPTCHA?

CAPTCHA is a challenge-response test that acts as an authentication mechanism on websites, search engines, and web applications to ensure that users (with or without password-based credentials) are human beings and not automated bots trying to crowd the system and cause a cyberattack.

Today, access to the Internet’s vast wealth of information has become necessary for all users. Internet services include free email account creation, online polls, online payments, electronic transfers, and other digital social and commercial activities.

These sites are often under assault by bots, which are automated computer programs. To combat this, the Completely Automated Public Turing Test to Tell Computers and Humans Apart (commonly abbreviated as CAPTCHA) was created to differentiate between bots and humans.

The term “Turing test” is central to CAPTCHAs. A Turing test evaluates a computer’s capacity to simulate human behavior. Alan Turing, an early pioneer of computing and artificial intelligence (AI), proposed the test in 1950. A computer program “passes” the Turing test if its actions throughout the test cannot be distinguished from those of a person, i.e., if it behaves as a human would. A Turing test is not based on answering questions correctly; rather, it is concerned with how “human” the responses seem, irrespective of whether they are accurate.

Interestingly, while it’s referred to as a “public Turing test,” a CAPTCHA is really the inverse of a Turing test. Rather than attempting to establish whether a computer can pass as human, it identifies whether a purportedly human operator is actually a computer program (a bot). Let us now explore the fundamental principles behind this widely used online security method.

Understanding the meaning of CAPTCHA

CAPTCHA is a security measure that prevents bots from accessing online services. It works by presenting the user with information to interpret. The traditional CAPTCHA consists of warped or overlapping numbers and letters that the user has to enter into a form or box. The distortion makes the text difficult for bots to read, and access is blocked until the entered letters and numbers are validated.

This type of CAPTCHA relies on a person’s ability to generalize and recognize unfamiliar patterns based on varied prior experience. In contrast, simple bots can typically only follow predetermined patterns or enter random characters. This restriction reduces the probability that a bot will guess the correct combination.
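To put a rough number on how unlikely a random guess is, the short sketch below computes the chance that a bot entering uniformly random characters hits the right combination. The function name, the 36-character alphabet, and the six-character length are illustrative assumptions, not values from any particular CAPTCHA product.

```python
from fractions import Fraction


def random_guess_probability(alphabet_size: int, length: int, attempts: int = 1) -> Fraction:
    """Chance that at least one of `attempts` independent, uniformly random
    guesses matches a code of `length` characters drawn from `alphabet_size` symbols."""
    if alphabet_size < 1 or length < 1 or attempts < 0:
        raise ValueError("alphabet_size and length must be >= 1, attempts >= 0")
    single_guess = Fraction(1, alphabet_size ** length)
    # P(at least one hit) = 1 - P(every guess misses)
    return 1 - (1 - single_guess) ** attempts


# Assumed example: 6 characters drawn from A-Z and 0-9 (36 symbols).
one_try = random_guess_probability(36, 6)
print(f"single guess: 1 in {one_try.denominator:,}")
```

Under these assumptions a single blind guess succeeds about once in 2.18 billion tries, which is why attackers turn to recognition techniques rather than guessing.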

Since the introduction of CAPTCHA, learning-based bots (similar to bots used in robotic process automation or RPA) have been developed by malicious players. Using algorithms trained for pattern recognition, these bots can recognize conventional CAPTCHAs. As a result of this progress, modern CAPTCHA techniques are based on increasingly complicated tests. For instance, reCAPTCHA asks users to click in a certain area of the screen and wait until a timeout expires.

reCAPTCHA is a more advanced form of verification than standard CAPTCHA. Some reCAPTCHAs, like standard CAPTCHAs, require users to transcribe images of hard-to-read text. Unlike standard CAPTCHAs, however, reCAPTCHA draws its content from real-world material, such as photographs of street addresses and words from printed books.

Carnegie Mellon University researchers developed reCAPTCHA technology, which Google acquired in 2009.

The evolution of CAPTCHA

In 1997, the search engine AltaVista sought a technique to prevent automated uniform resource locator (URL) submissions to its search engine. This was the first time anything like CAPTCHA was used.

Andrei Broder, the principal scientist of AltaVista, developed a system that randomly generated an image of printed text, the first example of the CAPTCHA method. Computers couldn’t identify the text, but people could read it and type it into a text field with relative ease. Most online service providers subsequently adopted this innovative technique.

In 2000, researchers at Carnegie Mellon University developed the algorithm further and dubbed the resulting technology CAPTCHA. In April 2001, a patent was granted to Andrei Broder and his colleagues. Since then, there has been a continual and intensive effort in this field, but there is still an ongoing need for new technologies or enhancements to existing techniques.

Applications of CAPTCHA

CAPTCHAs are employed by many websites that wish to prevent automated access. Some common applications of CAPTCHA include:

Securing online polls: By verifying that every vote is cast by a person, CAPTCHAs help prevent poll skewing. While this does not restrict the total number of votes, it increases the time necessary for each vote, discouraging repeated voting.
Restricting registration to real users: CAPTCHAs can be used by services to prevent bots from flooding registration systems and generating bogus accounts. Restricting the creation of accounts avoids the waste of a service’s resources and reduces the risk of fraud.
Protecting demand-based pricing systems: Using CAPTCHA, ticketing systems may prohibit scalpers from acquiring large quantities of passes or tickets for reselling, leading to price inflation. It could also be employed to prevent fraudulent event registrations.
Authenticating comments: CAPTCHAs may prevent bots from flooding internet forums, feedback forms, and review sites. The additional step required by a CAPTCHA may also help contain online abuse by inconveniencing such cyber threat actors.

Drawbacks of CAPTCHA

A few pitfalls should also be considered while using CAPTCHA. First, a CAPTCHA evaluation may halt the flow of what users are trying to do, giving them a poor impression of their interactions on the online domain and, in some cases, forcing them to leave the web page altogether.

Second, most CAPTCHAs rely on visual processing, which makes them almost impossible to use for people who are blind or have significantly impaired eyesight. There are workarounds for this, as we will discuss later. Lastly, CAPTCHAs are not foolproof and shouldn’t be relied upon for comprehensive bot protection due to a few inherent security vulnerabilities. We will explain these threats to CAPTCHA later in the article.

How Does CAPTCHA Work?

The web and its varied services significantly impact our lives nowadays. It offers a broad range of online applications, including education, banking, shopping, and communications, among others. Most of these services require users to register by completing an online form before they can access a website or web page. Unfortunately, fraudsters and hackers have created automated programs known as bots that complete online registrations automatically.

The reality is that not all “users” on the web are authentic; occasionally, they may be bots. To restrict access to just authorized individuals, several security mechanisms have been devised and implemented. The most used is the login authentication method, which uses text-based passwords to verify the legitimacy of users.

The fundamental issue with this approach is that malicious programs may crack the login password, typically through tactics like a brute force attack. By posing as a person while signing in, such intrusion software may access sensitive information without authorization.

CAPTCHA was designed to safeguard online information from unauthorized users in response to this problem. CAPTCHA is a human interactive proof (HIP) mechanism that imposes a challenge-response test that people can complete with relative ease, but spammers and bots cannot. Before gaining access to web services online, the CAPTCHA authentication method requires users to pass a test.

It is software designed to distinguish between real human users and unauthorized bot programs. It requires users to type in distorted text, recognize pictures, identify sounds, solve a problem, etc. These tests are meant to be easy for humans to pass but hard for bots.
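The challenge-response flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any specific CAPTCHA service works: the server picks a random answer, hands the browser a signed token with an expiry time, and later checks the user's response against that token. `SECRET_KEY`, `TTL_SECONDS`, `issue_challenge`, and `verify_response` are all hypothetical names.

```python
import hashlib
import hmac
import secrets
import string
import time
from typing import Optional, Tuple

SECRET_KEY = secrets.token_bytes(32)   # hypothetical per-server secret
ALPHABET = string.ascii_uppercase + string.digits
TTL_SECONDS = 120                      # assumed lifetime of a challenge


def _sign(answer: str, expires_at: int) -> str:
    # Bind the (case-insensitive) answer to its expiry time.
    message = f"{answer.upper()}|{expires_at}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_challenge(length: int = 6, now: Optional[float] = None) -> Tuple[str, str]:
    """Return (answer, token). The answer would be shown to the user as a
    distorted image or audio clip; the token travels in a hidden form field."""
    now = time.time() if now is None else now
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    expires_at = int(now) + TTL_SECONDS
    return answer, f"{expires_at}:{_sign(answer, expires_at)}"


def verify_response(response: str, token: str, now: Optional[float] = None) -> bool:
    """Check the user's typed response against the signed token."""
    now = time.time() if now is None else now
    try:
        expires_text, signature = token.split(":", 1)
        expires_at = int(expires_text)
    except ValueError:
        return False                   # malformed token
    if now > expires_at:
        return False                   # challenge expired
    expected = _sign(response.strip(), expires_at)
    return hmac.compare_digest(expected, signature)
```

One design caveat with this sketch: a stateless token can be replayed until it expires, so a real deployment would probably also record used tokens on the server.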

Websites employ many CAPTCHA strategies to safeguard their information. These techniques may be classified into two major categories: optical character recognition (OCR)-based and non-OCR-based CAPTCHAs. OCR-based CAPTCHAs are usually text-based, in that the user must enter the distorted text into a text field. Typically, non-OCR-based CAPTCHAs use graphics, audio, video, animations, and reasoning to test the user’s cognitive abilities.

Apart from this, the precise working of a CAPTCHA solution will depend on its type. Here are the key types of CAPTCHAs and a brief overview of how they work:

1. Text CAPTCHAs work using distorted text

Text-based CAPTCHAs represent the most prevalent category of CAPTCHA, which requires a user to fill in a sequence of shown numbers or characters. These CAPTCHAs may include well-known phrases or words and random number and character combinations. Some text-based CAPTCHAs also contain capitalization adjustments.

The CAPTCHA presents these characters in a deliberately confusing way that requires interpretation. Characters can be obscured by resizing, tilting, or distorting them. They may also be overlaid with visual elements like color, background noise, dashes, arcs, and spots. This distortion safeguards against bots with inadequate character recognition systems, but it may sometimes also make the text difficult for humans to interpret.
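As a rough illustration of the resizing, tilting, and noise overlays just described, the sketch below lays out each character with a random size, rotation, and offset, then draws a few noise lines across it. It emits SVG only because that needs nothing beyond the standard library. An SVG still contains the characters as plain text, so a real system would rasterize the result to a bitmap before showing it. The function name and all numeric ranges are assumptions.

```python
import random
from xml.sax.saxutils import escape


def render_text_captcha_svg(text: str, rng: random.Random,
                            width: int = 200, height: int = 70) -> str:
    """Lay out `text` with per-character tilt, size, and jitter, plus noise lines."""
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">']
    step = width / (len(text) + 1)
    for index, char in enumerate(text, start=1):
        x = step * index + rng.uniform(-4, 4)    # horizontal jitter
        y = height / 2 + rng.uniform(-8, 8)      # vertical jitter
        angle = rng.uniform(-30, 30)             # tilt in degrees
        size = rng.randint(24, 36)               # resize
        parts.append(
            f'<text x="{x:.1f}" y="{y:.1f}" font-size="{size}" '
            f'transform="rotate({angle:.1f} {x:.1f} {y:.1f})">{escape(char)}</text>'
        )
    for _ in range(4):                           # overlay noise lines
        x1, x2 = rng.uniform(0, width), rng.uniform(0, width)
        y1, y2 = rng.uniform(0, height), rng.uniform(0, height)
        parts.append(
            f'<line x1="{x1:.1f}" y1="{y1:.1f}" x2="{x2:.1f}" y2="{y2:.1f}" stroke="gray"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg_markup = render_text_captcha_svg("AB7K", random.Random())
```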

2. Audio CAPTCHAs rely on difficult-to-interpret, noisy audio

Audio CAPTCHAs were developed to accommodate visually impaired users. These CAPTCHAs are typically used in combination with CAPTCHAs based on text or images. Audio CAPTCHAs play an audio clip containing a string of characters that the user must enter. Users can often request that a written CAPTCHA be delivered as an MP3 audio file instead. These CAPTCHAs depend on the inability of bots to distinguish the relevant characters from background noise. However, they may be difficult for both humans and computers to decode.

3. Image CAPTCHAs utilize contextual recognition

Text-based CAPTCHAs may be replaced by image-based CAPTCHAs. These CAPTCHAs employ graphical elements, like photos of wildlife, objects, or places, that are easily recognizable. Typically, image-based CAPTCHAs require users to choose images that correspond to a topic or identify images that do not correspond.

Image-based CAPTCHAs are often easier for humans to decipher than text-based CAPTCHAs. However, they present major accessibility obstacles for persons with vision impairment. Image-based CAPTCHAs are also harder for bots to decipher than text-based CAPTCHAs because they involve both image recognition and semantic classification.
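A "select every image that matches the topic" challenge could be modeled roughly as follows. The image pool, file names, and labels are invented for illustration. A real service would draw on a much larger labeled collection and would not expose labels in file names.

```python
import random
from typing import Dict, List, Set, Tuple

# Hypothetical labeled image pool (file name -> topic label).
IMAGE_POOL: Dict[str, str] = {
    "img_001.jpg": "cat", "img_002.jpg": "cat", "img_003.jpg": "cat",
    "img_004.jpg": "dog", "img_005.jpg": "dog", "img_006.jpg": "dog",
    "img_007.jpg": "car", "img_008.jpg": "car", "img_009.jpg": "car",
}


def build_image_challenge(topic: str, grid_size: int,
                          rng: random.Random) -> Tuple[List[str], Set[int]]:
    """Return (grid of image names, set of grid positions showing `topic`)."""
    matching = [name for name, label in IMAGE_POOL.items() if label == topic]
    others = [name for name, label in IMAGE_POOL.items() if label != topic]
    if not matching or grid_size < 2 or grid_size - 1 > len(others):
        raise ValueError("image pool is too small for this topic and grid size")
    # Always include at least one match and at least one non-match.
    n_match = rng.randint(1, min(len(matching), grid_size - 1))
    grid = rng.sample(matching, n_match) + rng.sample(others, grid_size - n_match)
    rng.shuffle(grid)
    expected = {i for i, name in enumerate(grid) if IMAGE_POOL[name] == topic}
    return grid, expected


def check_selection(selected: Set[int], expected: Set[int]) -> bool:
    """The user passes only by selecting exactly the matching positions."""
    return selected == expected
```

The check is deliberately strict: a missed match or an extra selection both count as a failure.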

4. Video CAPTCHAs convert text or images into moving visuals

This newer type of CAPTCHA aims to be both resistant to bots and easy for human beings to use. It uses moving video to verify genuine web interactions. It is considered stronger than image-based CAPTCHAs since the authentication material is shown as a video, which is more difficult for bots to interpret.

5. Puzzle CAPTCHAs ask human users to both interpret and calculate

Typically, a puzzle-based CAPTCHA might be either a graphical or mathematical problem. In an image or photo-based puzzle, the image is broken into parts that are distributed at random. Each component of the image will be labeled with numbers. The user must correctly assemble these parts by following the numbers to recreate the original image.
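The numbered-tile puzzle can be reduced to a small sketch: scramble the tile numbers, then accept the user's arrangement only when the numbers are back in order. The function names are hypothetical, and the image slicing itself is left out.

```python
import random
from typing import List


def shuffle_tiles(tile_count: int, rng: random.Random) -> List[int]:
    """Return tile numbers 1..tile_count in a scrambled (never solved) order."""
    if tile_count < 2:
        raise ValueError("need at least two tiles to scramble")
    solved = list(range(1, tile_count + 1))
    scrambled = solved[:]
    while scrambled == solved:      # reshuffle if we happen to land on the solution
        rng.shuffle(scrambled)
    return scrambled


def is_solved(arrangement: List[int]) -> bool:
    """True when the tiles are back in their original 1, 2, 3, ... order."""
    return arrangement == list(range(1, len(arrangement) + 1))
```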

A mathematics-based challenge is also effective and may be readily included in login procedures and online registration forms. To gain access to the online material, the user must answer a mathematical problem (typically a simple school-level arithmetic question).
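A minimal sketch of such an arithmetic challenge follows. The operand range, the three operators, and the function names are assumptions chosen to keep the question simple.

```python
import operator
import random
from typing import Tuple

OPERATIONS = {"+": operator.add, "-": operator.sub, "x": operator.mul}


def make_math_challenge(rng: random.Random) -> Tuple[str, int]:
    """Return (question text, expected integer answer)."""
    symbol = rng.choice(sorted(OPERATIONS))
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    if symbol == "-" and b > a:
        a, b = b, a                 # keep subtraction results non-negative
    return f"What is {a} {symbol} {b}?", OPERATIONS[symbol](a, b)


def check_math_answer(submitted: str, expected: int) -> bool:
    """Accept the answer only if it parses as the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False                # non-numeric input fails the check
```

Because plain-text arithmetic is trivial for a script to parse, a question like this would normally be rendered as a distorted image rather than sent as text.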

Features of CAPTCHA

CAPTCHA-based user authentication has several characteristics; it will be:

Automated: By definition, CAPTCHAs are automated and require minimal human maintenance or interaction to administer. This offers clear cost and reliability advantages.
Contextual: Context is also an essential CAPTCHA component. To accurately detect each character, the CAPTCHA should be comprehended holistically. In one portion of a CAPTCHA, for instance, a letter may appear as an “m.” Only after the entire word is considered does it become evident that the letters are “u” and “n.”
Accessible: CAPTCHAs based on reading text or other visual perception challenges can block blind and visually impaired users from the protected resource. However, CAPTCHAs do not have to be visual. They may be based on any hard artificial intelligence problem, such as speech recognition. Some CAPTCHA solutions let users choose an audio CAPTCHA. Other solutions do not require users to type words but instead ask them to select photos that share a common theme from a randomized selection.
Built on a publicly available algorithm: The CAPTCHA algorithm must be publicly disclosed by definition; however, it may be protected by a patent. This is done to indicate that cracking it involves the answer to a difficult puzzle powered by artificial intelligence (AI), as opposed to just discovering the (secret) algorithm via reverse engineering or other code-breaking techniques.
Segmentation-based: Segmentation, or the capacity to separate one character from the next, is made harder by the lack of white space between letters in CAPTCHAs.
Reliant on invariant recognition: Invariant recognition is the ability to identify letters whose forms vary substantially. A human brain can effectively recognize almost unlimited variants of each character. The same cannot be said for a machine, and training it to identify all of these variations is arduous.

Threats to CAPTCHA

As mentioned earlier, CAPTCHA is not foolproof and may be “fooled” by advanced cybersecurity threats and attacks. Three types of threats may exploit the vulnerabilities of CAPTCHA. There are also risks associated with CAPTCHA plugins that can be exploited. The threats and risks are:

1. AI trained to solve problems as a human user would

A machine learning algorithm may be trained to tackle many of the problems that humans are capable of solving. Because machine learning (ML) and OCR technologies are now widely available, often as open source, fraudsters have adopted them in large numbers. By giving computers cognitive skills, these technologies have forced a fundamental change in the reasoning behind CAPTCHA technology. reCAPTCHA is one method for combating this problem.

2. Browser automation tools that automatically mimic human behavior

Browser automation tools, which enable bot programs to seem more human-like, are exceedingly difficult to identify and stop. They allow for the execution of large-scale penetration tests, navigation from site to site without human interaction, the evaluation of JavaScript, and the emulation of browser capabilities. Essentially, browser automation runs a complete browser under programmatic control. That is why it is among the top threats to CAPTCHA.

3. CAPTCHA-completion farms that solve problems in bulk

A CAPTCHA-solving factory (or farm) refers to a service in which CAPTCHAs are solved remotely by human workers, with requests submitted through an application programming interface (API). This strategy exploits the basic premise of a CAPTCHA, i.e., its ability to distinguish actual people from automated computer systems. Since many human workers can answer CAPTCHAs en masse on behalf of bots, they may circumvent its threat detection capacity.

4. Risks contained in CAPTCHA plugins

These plugins are ready-to-use programs that embed a CAPTCHA in a website, and they are widely used by web developers. CAPTCHA plugins are freely accessible through WordPress libraries or repositories such as GitHub. But, as with any code, these plugins may have vulnerabilities, especially if the code originates from a third- or fourth-party provider.

Cross-site scripting, which injects malicious code straight into websites, is one of the most prevalent dangers associated with plugins. It can give attackers access to sensitive browser data, including cookies, login details, and identification information. Existing vulnerabilities, such as those found in CAPTCHA plugins, can provide a straightforward route for injecting malicious code.
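To make the cross-site scripting risk concrete, the sketch below shows one common mitigation: escaping user-supplied text before placing it in a page, so that injected markup is displayed as text instead of being executed. `render_comment` is a hypothetical helper and is not taken from any CAPTCHA plugin. Escaping is only one layer of defense.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build an HTML snippet with user input escaped, so tags in the
    input are shown literally rather than interpreted by the browser."""
    return f"<p><strong>{html.escape(author)}</strong>: {html.escape(body)}</p>"


snippet = render_comment("eve", "<script>alert(1)</script>")
```

Here the `<script>` tag in the comment body ends up as `&lt;script&gt;` in the output, which a browser renders as visible text.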

To prevent such threats, it is crucial to always verify the source of CAPTCHA code. Enterprises should invest in a commercial CAPTCHA solution for their web apps and e-commerce pages to ensure optimum web application security.

Takeaway

CAPTCHA is now an almost ubiquitous part of the web browsing experience. These increasingly sophisticated authentication systems hunt for signs that you are human, from your ability to perform contextualized mathematical calculations to how you move a cursor over an empty web page.

However, CAPTCHA is not without its flaws. For example, a recent cyber attack campaign called Purple Urchin managed to bypass CAPTCHA mechanisms and target cloud resources. That is why web developers should use CAPTCHA only as the first line of defense and ensure they bolster security with other authentication measures.


Chiradeep BasuMallick

Technical Writer

Chiradeep is a content marketing professional, a startup incubator, and a tech journalism specialist. He has over 11 years of experience in mainline advertising, marketing communications, corporate communications, and content marketing. He has worked with a number of global majors and Indian MNCs, and currently manages his content marketing startup based out of Kolkata, India. He writes extensively on areas such as IT, BFSI, healthcare, manufacturing, hospitality, and financial analysis & stock markets. He studied literature, has a degree in public relations and is an independent contributor for several leading publications.
