Security concepts

Websites and online apps can be accessed by anyone, from anywhere in the world. That is one of the things that makes the internet so useful, but it also means that anyone, anywhere, can attempt to break into your site.

The most common hacks use the website's own interface, for example the login page. Simply typing in a specially constructed username can break a site that has very poor security. Here are a few basic concepts in web security.

Input validation

Input validation simply means checking everything the user types in to make sure it is basically correct. Here are some simple checks:

An email address must contain an "@" symbol, and must end in ".something" (e.g. .com, .org etc).
A phone number must contain only digits.
A credit card number must have the right number of digits (16 for most cards).
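
The three checks above can be sketched in Python as follows. This is a minimal illustration rather than production-grade validation: the function names are made up for this example, the email pattern is deliberately loose, and the card check assumes the common 16-digit format.

```python
import re

def is_valid_email(text: str) -> bool:
    """Check for a single "@" and an ending like ".com" or ".org"."""
    # Something before the @, something after it, then a dot and at least two letters
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}", text) is not None

def is_valid_phone(text: str) -> bool:
    """Check that the text is not empty and contains only the digits 0-9."""
    return re.fullmatch(r"[0-9]+", text) is not None

def is_valid_card_number(text: str) -> bool:
    """Check for exactly 16 digits (an assumption; some cards differ)."""
    return re.fullmatch(r"[0-9]{16}", text) is not None

print(is_valid_email("alice@example.com"))        # True
print(is_valid_email("alice.example.com"))        # False, no @ symbol
print(is_valid_phone("01234 567890"))             # False, contains a space
print(is_valid_card_number("1234567812345678"))   # True
```

A real program would normally tell the user which check failed, so that they can correct the value and try again.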

Attackers might attempt to input invalid data in the hope that it will cause the program to crash and perhaps allow the attacker to access the server.

In fact, input validation is a good idea for any program. If a user types in invalid data, it is better to detect the mistake and inform the user. Otherwise the program will not give the expected result, and the user will have no idea what went wrong.

Input sanitisation

Sometimes an attacker might enter an invalid value, for example as a username or password, which has been specially designed to trick the system into executing the attacker's code.

A common example is SQL injection. The attacker crafts the input data to resemble an SQL command. If this is executed, it might reveal something about the system's database. Through trial and error, the attacker might eventually be able to access user data. Or, if the attacker is just attempting to cause disruption, they might simply delete the database, bringing the website down until it is restored from backup.
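
To make this concrete, here is a small sketch using Python's built-in sqlite3 module with a throwaway in-memory database. The table, the user names and the function names are all invented for the illustration. The first function pastes the user's text straight into the SQL command, so a crafted password changes what the command means. The second function uses query placeholders, which is a different defence from the sanitisation described in this section, and is shown only for comparison.

```python
import sqlite3

# A throwaway in-memory database with two made-up users
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "apple123"), ("bob", "banana456")],
)

def find_user_unsafe(name, password):
    # UNSAFE: the user's text becomes part of the SQL command itself
    query = f"SELECT name FROM users WHERE name = '{name}' AND password = '{password}'"
    return conn.execute(query).fetchall()

def find_user_with_placeholders(name, password):
    # The ? placeholders make the database treat the input purely as data
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()

crafted = "' OR '1'='1"   # not a real password, but valid SQL when pasted in

print(find_user_unsafe("alice", crafted))             # every user is returned
print(find_user_with_placeholders("alice", crafted))  # no user is returned
```

The crafted value works because the quote character ends the password string early, and the rest of the input is then read as part of the command.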

These attacks usually rely on certain punctuation characters, such as quotation marks, semicolons or backslashes, being included in the text the attacker supplies.

The idea of input sanitisation is to ensure that only permitted characters are present in any data supplied by the user. So, for example, if the username is only allowed to contain letters, numbers and underscore characters, the input sanitisation code will remove any other characters from the input.
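
A minimal sketch of such a sanitiser in Python is shown here. The function name is made up for the example, and it assumes that "letters" means the unaccented letters A to Z in upper or lower case.

```python
import re

def sanitise_username(text: str) -> str:
    # Remove every character that is not a letter, a digit or an underscore
    return re.sub(r"[^A-Za-z0-9_]", "", text)

print(sanitise_username("rob_2024"))       # rob_2024
print(sanitise_username("rob'; DROP--"))   # robDROP
```

The quote, semicolon, space and dashes are stripped out, so what remains can no longer be read as part of a command.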

Blacklists and whitelists

Sometimes blacklists and whitelists are used for input validation or sanitisation.

A blacklist is a list of values which are not allowed. For example, suppose you run an online forum, and you find that you are getting a lot of spam posts. Closer examination shows that most of that spam is coming from free email accounts. You could set up a blacklist which lists the domains that generate most of the spam, and prevents anyone with a matching email address from joining.

This won't get rid of all the spam, but it might help a lot, and it doesn't prevent any genuine users from signing up with a proper email address. A blacklist gets rid of most of the stuff you don't want, without banning the stuff you do want.
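
The forum example might be sketched like this in Python. The blocked domains and the function name are hypothetical, chosen only for the illustration.

```python
# Hypothetical domains that have produced most of the spam
BLOCKED_DOMAINS = {"freemail.example", "spammy.example"}

def is_blocked(email: str) -> bool:
    # The domain is everything after the last "@", compared in lower case
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in BLOCKED_DOMAINS

print(is_blocked("bot123@FreeMail.example"))  # True, sign-up refused
print(is_blocked("alice@school.example"))     # False, sign-up allowed
```

Anything that is not on the list is let through, which is why a blacklist never blocks more than you have explicitly told it to.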

A whitelist is a list of values which are allowed - anything not on the list is not allowed. For example, suppose you were setting up a web browser for a primary school. You only want the pupils to be able to access websites that have been checked by teachers to make sure they are suitable. You would use a whitelist to list all the sites that the pupils can look at. Anything not on the list is banned automatically.

A whitelist gets rid of all the sites you don't want, but often throws out a lot of perfectly good sites at the same time.
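
The school browser example could be sketched as follows. The approved sites and the function name are invented for the illustration, and the check assumes it is given a full URL including the "https://" part.

```python
from urllib.parse import urlsplit

# Hypothetical sites that teachers have checked and approved
ALLOWED_SITES = {"maths.example.org", "reading.example.org"}

def is_allowed(url: str) -> bool:
    # Pull the host name out of the URL; it is None if the URL has no host
    host = urlsplit(url).hostname
    return host is not None and host.lower() in ALLOWED_SITES

print(is_allowed("https://maths.example.org/fractions"))  # True
print(is_allowed("https://games.example.com/"))           # False
```

Notice that the code is almost the same as a blacklist check. The difference is the default: here, anything that is not on the list is refused.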

Authentication

Authentication simply means proving that you are who you claim to be, so that the system can decide whether you are allowed to access it.

You are often required to authenticate yourself before accessing a website. In most cases, that just means entering a username and password.
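
A username and password check might be sketched like this. Everything here is an assumption made for the illustration: the `users` dictionary stands in for a real database, the function names are invented, and the choice to store a salted hash rather than the password itself is common practice but is not something described in this article.

```python
import hashlib
import hmac
import os

# Hypothetical user store: username -> (salt, password hash)
users = {}

def hash_password(password: str, salt: bytes) -> bytes:
    # Turn the password into a hash so the password itself is never stored
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def register(username: str, password: str) -> None:
    salt = os.urandom(16)  # random value, different for every user
    users[username] = (salt, hash_password(password, salt))

def authenticate(username: str, password: str) -> bool:
    record = users.get(username)
    if record is None:
        return False  # unknown username
    salt, stored_hash = record
    # Hash the attempt with the same salt and compare it to the stored hash
    return hmac.compare_digest(stored_hash, hash_password(password, salt))

register("alice", "correct horse")
print(authenticate("alice", "correct horse"))  # True
print(authenticate("alice", "wrong guess"))    # False
print(authenticate("mallory", "anything"))     # False
```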

{{% purple-note %}} The most secure form of authentication is based on 3 factors:

A secret (something you know).
Your identity (something you are).
A physical object (something you have).

It is quite rare for a system to use all 3 factors. Many use only one or two. For example:

Getting into your house requires just the door key (a physical object). If a thief steals the key, he doesn't need to identify himself or know a secret to unlock the door.
Travelling abroad requires a passport (a physical object), and also your identity (your face must match the photograph - something you are). No secret is required.
Getting money from a cash machine requires a card (a physical object) and a secret (the PIN). You don't have to identify yourself.

When you log onto a website you normally need a username (identity) and a password (a secret). But of course your username isn't much of an identity, because on many websites you can make up a name. The password just prevents someone else from impersonating your username.

{{% green-note %}} For online banking sites, more security is required because an attacker could steal money from your account. Banks will often take steps to verify your email address. For extra security, when you transfer money, they might also send an authorisation code to your phone, which you have to enter on the website.

This covers all 3 factors - your identity (your verified email address, although this is a weaker proof of identity than something like a fingerprint), your secret (the password), and something you have (your phone). {{% /green-note %}}
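
The authorisation code step could be sketched as follows. This is only an outline under stated assumptions: the function names are invented, the code is assumed to be 6 digits long, and actually sending the code to a phone is left out because it depends on the bank's own systems.

```python
import hmac
import secrets

def make_code() -> str:
    # Six random digits, padded with leading zeros if necessary
    return f"{secrets.randbelow(1_000_000):06d}"

def code_matches(expected: str, entered: str) -> bool:
    # Ignore stray spaces around what the user typed, then compare
    return hmac.compare_digest(expected.encode("utf-8"), entered.strip().encode("utf-8"))

code = make_code()                       # this would be sent to the user's phone
print(code_matches(code, f" {code} "))   # True, the user typed the code correctly
```

Because the code is random and only reaches the registered phone, an attacker who has stolen the password still cannot complete the transfer without also having the phone.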
