Every newly deployed Web application creates a new security hole in accessing of your organization’s data. The hackers get access to the data that are supposedly hidden behind the firewalls by entering through the ports used by Web application. There is no way to guarantee that your Web application is 100% secure. If it was never attacked by hackers this means that it’s too small, has almost no users, and is of no interest to them.
This chapter is a brief overview of major security vulnerabilities that Web application developers need to be aware of. There are plenty of books and online articles that cover security, and enterprises usually have dedicated people to handle security issues for the entire organization. Dealing with security threats is their bread and butter and this chapter won’t have any revelations for security professionals. But software developers should have at least a broad understanding of what makes a Web application more or less secure and which threats Web applications face - this is what this chapter is about. To implement any of the security mechanisms mentioned in this chapter you’ll need to do additional research.
Tip
|
A good starting point for establishing security process for your enterprise project is Microsoft’s Web site Security Development Lifecycle. It contains a number of documents describing the software development process that helps developers build more secure software and address security compliance requirements while reducing development cost. |
Imagine a popular night club with a tall fence and two entry doors. People are waiting in lines to get in. The door number 80 is no guarded in any way - a college student checks the tickets and let people in. The other door has the number 443 on it, and it’s protected by an armed bully letting qualified people in. The chances of unwanted people getting into the club through the door 443 are pretty slim (unless the bully is corrupted), which is not the case with door 80 - once in a while people that have no right to be there get inside.
On a similar note, your organization created network security with a firewall (the fence) with only two ports (the doors) opened: 80 for HTTP requests and 443 for HTTPS. One door’s not secure, the other one is.
Warning
|
Don’t assume that your Web application is secure if it’s deployed behind the firewall. As long as there are opened ports that allow external users accessing your Web application, you need to invest into the application security too. |
The letter "s" in https means secure. Technically, https creates a secure channel over insecure Internet connection. In the past, only the Web pages that dealt with logins, payments or obviously sensitive data would use URL’s starting with https. Today, more and more Web pages use https and rightly so, because it forces Web browsers to use Secure Socket Layer (SSL) or its successor Transport Layer Security (TLS) protocol for encrypting all the data (including request and response headers) that travel between the connected Internet resources. The O’Reilly book "High Performance Browser Networking" by Ilya Grigorik contains a chapter with detailed coverage of the TLS protocol.
The organization that runs a Web server creates the public key certificate that has to be signed by a trusted certificate authority (otherwise browsers will display invalid certificate warnings). The authority certifies that the holder of the certificate is a valid operator of the Web server. SSL/TLS layers authenticate the servers using these certificates to ensure that the browser’s request is being processed by the proper server and not by some hacker’s site.
When the client connects to the server via https, the client offers to the server a list of supported ciphers (authentication-encryption-decryption algorithms). The server replies with the sipher they both support.
Tip
|
The annual Black Hat computer security conference is dedicated to the information security. This conference is attended by both hackers and security professionals. |
If https is clearly more secure than http, why not every Web site uses only https communications? Since https encrypts all the messages that travel between the client’s browser and the server, such communications are a slower and need more CPU power comparing to http-based. But this little slowness won’t be noticeable in most Web applications (unless thousands of concurrent users hit the Web server), while the benefits of using https are huge.
When entering any sensitive or private information in someone’s Web application always pay attention to the URL to make sure that it uses https protocol.
As a Web developer, you should always use the https protocol to prevent an attacker from stealing the user’s session id. The fact that the National Security Agency has broken SSL encryption algorithm is not the reason for your application for not using https.
Authentication is an ability to confirm that the user is who he claims to be. The fact the the user has provided a valid id/password combination proves that he is known to the Web application. That’s all.
Specifying the correct user id/password combination may not be enough for some of the Web application. Banks often ask for additional information like "What’s your pet’s name?" or "What’s your favorite movie?
If you ever worked for a large corporation, you know that they often use the RSA Secure ID (a.k.a. RSA hard token), which is a physical device with random-generated combination of digits. This combination changes every minute or so and has to be entered as a part of the authentication process. Beside physical devices there are programs (soft tokens) that perform user authentication in a similar way. Many financial institutions, social networks, and large Web portals support 2-factor verification - besides asking for a user ID and password, they send you either an email, voice mail, or text message with a code that you’ll need to use after entering the right id/password combination.
To make the authentication process more secure some systems check the biometrics of the user. For example, in the USA the Global Entry system is implemented in many international airports. People who successfully passed a special background check are entered into the system used in the passport control checkpoints. On the borders the applications deployed in a special kiosks scan the users passports, check the face topography and fingerprints. The process takes several seconds and the authenticated person can pass the border without waiting in long lines.
Biometrics devices become more common these days, and the fingerprint scanners that can be connected to the user’s computer are very inexpensive. Apple’s iPhone 5S unlocks the phone based on the fingerprint of its owner. In some apartment buildings you can enter the gym only after your fingerprints are scanned and matched. National Institute of Standards and Technology hosts a discussion about using biometrics web services, and you can participate by sending an email to [email protected] with subscribe as the subject.
HTTP protocol defines two types of authentication: Basic and Digest. All modern web browsers support them, but basic authentication uses base64 encoding and no encryption, which means it should be used only with https protocol.
A Web server administrator can configure certain resources to require basic user authentication. If a Web browser requests such protected resource while the user didn’t login to the site, the Web server (not your application) sends the HTTP response containing HTTP status code 401 - Unauthorized and WWW-Authenticate: Basic. The browser pops up the login dialog box. The user enters the ID/password, which are turned into an encoded userID:password string and sent to the server as a part of HTTP header. Basic authentication provides no confidentiality because it doesn’t encrypt the transmitted credentials. Cookies are not used here.
With digest authentication, the server also responds with 401 (WWW-Authenticate: Digest). However, it also sends additional data which allows the Web Browser to apply a hash function to the password. Then the browser sends encrypted password to the server. Digest authentication is more secure than the basic one, but it’s still less secure than authentication that uses public keys or Kerberos authentication protocol.
Tip
|
The HTTP status code 403 (Forbidden) differs from 401. While 401 means that the user needs to login to access the resource, 403 means that the user is authenticated, but his security level is not high enough to see the data. For example, not every user role is authorized to see the Web page with salary report. |
Pretty often an enterprise user has to work with more than one corporate Web applications, and maintaining, remembering, supporting multiple passwords should be avoided. Many enterprises implement a single sign-on (SSO) mechanism to eliminate the need for the user to enter his login credential more then once even if the user works with multiple applications. Accordingly, if the user signs out from one of these applications, terminates his access to all of them. SSO solutions makes authentication totally transparent to your application.
Typically, when the user logs on to your application, the logon process is intercepted and handled by pre-configured single sign-on software (e.g. Oracle Enterprise Single Sign-On, CA SiteMinder, IBM Security Access Manager for Enterprise SSO, or Evidian Enterprise SSO). The SSO infrastructure verifies user’s credentials by making a call to a corporate LDAP server and creates a user’s session. Usually a Web server is configured with some Web agent, which will add the user’s credential to the HTTP header, which your application can fetch.
The future access to the protected Web application is provided automatically (by the SSO server)without even displaying a logon window as long as the user’s session is active. SSO servers also log all login attempts in a central place, which can be very important to meet the enterprise regulatory requirements (e.g. Sarbanes-Oxley in financial industry or medical confidentiality in the insurance business).
In the consumer-oriented Internet space single (or reduced) sign-on solutions become more and more popular. For example, some Web applications allow reusing your Twitter or Facebook credentials (provided that you’ve logged on to one of these applications) without the need to go through additional authentication procedures. Basically, your application can delegate authentication procedures to Facebook, Twitter and other authorization services.
Back in 2010, Facebook has introduced their SSO solution that helps millions of people log on to other applications. This is especially important in the mobile world, where users' typing should be minimized. Instead of asking the user to enter credentials, your application can show the button "Login with FaceBook".
Facebook has published JavaScript API that allows implementing FaceBook Login in your Web applications(they also offer native API for iOS and Android apps). For more details visit online documentation on FaceBook Login API.
In addition to Facebook other popular social networks offer SSO:
-
If you prefer your application to have the button "Login with Twitter", refer to the Sign in with Twitter API documentation.
-
LinkedIn is a popular social network for professionals. It also offers API to create the button "Sign in with LinkedIn". For details visit LinkedIn online documentation for developers.
-
Google also offers the authentication API. Details about their client library for JavaScript are published online. For implementing SSO with Google, visit this Web page.
-
Mozilla offers a new way to sign-in using any of your existing email addresses using Persona.
Note
|
In most of today’s enterprise contexts, your organization would not want you to use social networking logins. But some enterprises started to integrate their applications with social networks so SSO via social networks will become more and more popular. |
It might sound obvious, but we’ll still remind you that the Web client should never send passwords in clear text. Use Secure Hash Algorithms (SHA). Longer passwords are more secure, because if an attacker will try to guess the password by using dictionaries to generate every possible combination of characters (brute-forcing attack), it’ll take a lot more time with long passwords. Periodical changing of the passwords makes the hacker’s work more difficult too. Typically, after successful authentication the server creates and sends to the Web client the session ID, which is stored as a cookie on the client’s computer. Then, on each subsequent request to the server Web browser will place the session if in the HTTP request object and send it along with each request. So technically, user’s identity is always available at the server side, so the server-side code can re-authenticate the user more than once (without the user even knowing it) whenever the Web client requests the protected resource.
Tip
|
Salted hashes increase security by adding salt - a randomly generated data that’s concatenated with the password and then processed by a hash function. |
Have you ever wondered why Automated Teller Machines (ATM) often ask you to enter PIN more then once? Say, you’ve deposited a check and then want to see the balance.on your account. After the check deposit has been completed your ATM session was invalidated to protect the careless users who may rush out from the bank in a hurry as soon as the transaction is finished. Otherwise the next person by the ATM could have requested a cash withdrawal from your bank account.
On the same note, if the Web application’s session is idling more than allowed time interval, the session should be automatically invalidated. If a trader in a brokerage house is not interacting with the Web trading application for some time, invalidate the session to exclude the situation when the trader stepped out, and someone else is buying financial products on his behalf.
Authorization is a way to determine which operations the user can perform and what data he can access. For example, the owner of the company can perform money withdrawals and transfers from the online business bank account, while the company accountant is provided with the read-only access.
Note
|
Similarly to authentication the user’s authorization can be checked more than once during the user’s session. As a matter of fact, authorization can even change during the session (e.g. a financial application can allow trades only during the business hours of the stock exchange). |
Users of the application are grouped by roles, and each role comes with a set of privileges. The user can be given a privilege to read and modify certain data, while other can be hidden. In the relational DBMS realm there is a term row-level security, which means that the same query can produce different results to different users. Such security policies are implemented at the data source level.
A simple use case where row-level security is really useful is a salary report. While the employee can see only his salary report, the head of department can see the data of all subordinates.
Authorization is usually linked with the user’s session. HTTP is stateless protocol, so if a user retrieves a Web page from a Web server, and then goes to another Web page, this second page does not know what has been shown or selected on the first one. For example, in case of an online store the user adds an item to the shopping cart and moves to another page to continue shopping. To preserve the data needed to more than one Web pages (e.g. the content of the shopping cart) the server-side code must implement session tracking. The session information can be passed all the way down to the database level when need be.
Note
|
Session tracking is controlled by the servers. If you’d like to get familiar with session tracking option in greater details, consult the product documentation for the server or technology being used with your Web application. For example, if you use Java, you can read Oracle’s documentation for their WebLogic server that describes the option on session management. |
Most likely you ran into the web applications that offer you to share your actions via social networks. For example, you just made a donation and want to share this information via social networks. If, say a charity application needs to access the application the user’s Facebook account for authentication and authorization, the charity app may ask the user’s ID and password for the Facebook. This is not the right approach, because the charity app gets the user’s Facebook id/pwd in clear text plus it gives the charity app complete access to the user’s account in Facebook. But the charity app only needed to authenticate the Facebook user. There was a need for a mechanism to give a limited access to third party applications.
OAuth became such a mechanism. Oauth is "An open protocol to allow secure authorization in a simple and standard method from web, mobile and desktop applications". Its current draft specification provides the following definition:
The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf.
OAuth allows users to give limited access to third-party applications without giving away their passwords. The access permission is given to the user in a form of token with limited privileges and for a limited time. Coming back to our example of communication between the charity app and Facebook, the former would get a limited privilege - just to post a message on Facebook on the user’s behalf.
Any communications with OAuth 2.0 servers are made through the https connections. Below are the main actors of the OAuth workflow:
-
The user who wants to use some service is called resource owner.
-
The application that tries to authenticate the resource owner is called the client. This is an application that offers the buttons like "Login with FaceBook", "Login with Twitter" and the likes.
-
The resource server is a server that has implemented OAuth API for the client. Facebook, Google, Windows Live, Twitter, GitHub are some of such servers. For the current list of OAuth 2.0 implementations visit oauth.net/2. To implement OAuth in your JavaScript code, you need to pick a resource server and read the appropriate section in their documentation.
The process of using Facebook server is described in the document titled Getting Started with Facebook Login for Web.
In short, you start with creating an application ID on the Facebook App Dashboard, and then copy/paste a JavaScript SDK code (provided by Facebook) into your application. Include the newly created app id there too. Then add a JavaScript code to support Facebook login to your application and the URL of the redirection page in case of successful login.
Facebook Login API will continue communicating with your application by sending events as soon as the login status changes. Facebook will send the authorization token to your application’s code and will take care of its storage. Authorization token is a secure random string that identifies the user and the app, contains the information about permissions and has expiration time. Your application’s JavaScript code makes calls to Facebook SDK API, and each of these calls will include the token as a parameter.
OAuth has provisions for creating authorization tokens to browser-only applications, for mobile applications, and for the server-to-server communications. For the in-depth coverage get the O’Reilly book by Aaron Parecki "OAuth 2.0: The Definite Guide".
Open Web Application Security Project (OWASP) is an open source project focused on improving security of Web applications. OWASP is a collection of guides and tools for increasing security of Web applications. OWASP publishes and maintains the list of top 10 security risks. Figure Top 10 security risks circa 2013 shows how this list looked in 2013:
This Web site allows you to drill down on each of the items from this list, see the illustration of the selected security vulnerability and recommendations on how to prevent it. You can also download this list as a PDF document. Let’s review a couple of the top 10 security threats: injection and cross-site scripting.
If a bad guy will be able to inject a piece of code that will run inside your Web application, such code can steal or damage the data from this application. In the world of compiled libraries and executables injecting malicious code would be a rather difficult task. But if an application uses interpreted languages (e.g. JavaScript or clear text SQL) the task of injecting malicious code becomes a lot easier than you might think. Let’s look at a typical example of SQL injection.
Say your application can search for data based on some keywords the user enters into a text input field. For example, to find all donors in the city of New York the user will enter the following:
"New York"; delete from donors;
If the server side code of your application would be simply attaching the entered text to the SQL statement, this could result in execution of the following command:
Select * from donors where city="New York"; delete from donors;
This command doesn’t require any additional comments, does it? Is there a way to prevent the users of you Web application from entering something like this? The first thing that comes to mind is to not allow the user to enter the city, but force him to select it from the list. But such a list of possible values might be huge. Besides, the hacker can modify the HTTP request after the browser sends it to the server.
Tip
|
Always use pre-compiled SQL statements that use parameters to pass the user’s input into the database query (e.g. the PreparedStatement in Java). |
The importance of the server-side validation shouldn’t be underestimated. In some scenarios you can come up with a regular expression that checks for the matching patterns in the data received from the clients. In other cases you can write a regular expression that invalidate the data if it contains SQL (or other) keywords that leads to modifications of the data on the server.
Tip
|
Always minimize the interval between validating and using the data. |
In the ideal world the client-side code should not even send the non-validated data to the server. But in real-world you’ll end up with duplicating some of the validation code in both the client and server.
Cross-site scripting (XSS) - the user of your Web application receive some unwanted code fragments from a malicious server that reaches the user via the site that a person visited (hence cross-site). Single-page AJAX-based applications makes lots of under-the-hood requests to the servers, which increases the attack surface comparing to traditional legacy Web sites that would be downloading Web pages a lot less frequently. XSS can happen in three ways:
-
Reflected (a.k.a. phishing) - the Web page contains a link that seems valid, but when the user clicks on it, the user’s browser receives and executes the the script created by the attacker.
-
Stored - the external attacker managed to store the malicious script on the server that hosts someone’s Web application so every user will get it as a part of the Web page and their Web browser will execute it. For example, if a user’s forum allows posting texts that include JavaScript code, a malicious code typed by a "bad guy" can be saved in the server’s database and executed by users' browsers visited this forum afterward.
-
Local - no server is involved. Web page A opens Web page B with malicious code, which in turn modifies the code of the page A. If your application uses a hash-tag(#) in URLs (e.g. http://savesickchild.org#something), make sure that before processing this something doesn’t contain anything like "javascript:somecode", which may have been attached to the URL by an attacker.
W3C has published the draft of the Content Security Policy document - "a mechanism web applications can use to mitigate a broad class of content injection vulnerabilities, such as cross-site scripting".
Tip
|
In application security the term man in the middle attack refers to the case when an attacker intercepts and modifies the data transmitted between two parties (usually the client and the server). |
Microsoft has published a classification that divides security threats into six categories (hence six letters in the acronym STRIDE):
-
Spoofing - an attacker pretends to be a legitimate user of some application, e.g. a banking system. This may be implemented using XSS.
-
Tampering - modifying the data that were not supposed to be modified (e.g. via SQL injection).
-
Repudiation - the user denies that he sent the data (e.g. made an online transaction like purchase or sale) by modifying application’s log files.
-
Information disclosure - an attacker get an access to the classified information
-
Denial of Service (a.k.a. DoS) - make a server unavailable for the legitimate users, which often is implemented by generating a large number of simultaneous requests to saturate the server.
-
Elevation of privilege - gaining an elevated access to the data, e.g. by obtaining administrative rights.
Note
|
While we’ve been working on the section describing Apple’s developers certificates (Chapter 14) their Web site was hacked and was not available for about two weeks. |
Important
|
One of the OWASP guides is titled Web Application Penetration Testing. In about 350 pages it explains the methodology of testing a Web application for each vulnerability. OWASP defines penetration test as a method of evaluating the security of a computer systems by simulating an attack. Hundreds of security experts from around the world have contributed to this guide. Running penetration tests should become a part of your development process, and the sooner you start running them the better. |
So far in this chapter we’ve been discussing security vulnerabilities from the technical perspective. But there is another aspect that can’t be ignored - the regulatory compliance of the business you automate.
During the last four years the authors of this book develop, deploy, support, and market the software that automates certain workflows for insurance agents. We serve more several hundreds of insurance agencies and more than 100K agents. In this section we’ll share with you our real-world experience of dealing with security while running our company, which sells software as service. In addition to developing the application we had to set up the data centers and take care of security issues too.
Our customers are insurance agencies and carriers. We charge for our services, and our customers pay using credit cards using our application. This opens up a totally different category of security concerns:
-
Where the credit card numbers are stored?
-
What if they get stolen?
-
How secure is the payment portion of your application?
-
How the card holder’s data is protected?
-
Is there a firewall protecting customer’s data?
-
How the data is encrypted?
One of the first questions our perspective customers ask if our application is PCI compliant. They won’t work with us until they review the application-level security implemented in our system. As per the PCI Compliance Guide, "the Payment Card Industry Data Security Standard is used by all card brands to assure the security of the data gathered while an employee is making a transaction at a bank or participating vendor".
If your application stores PCI data, authenticating via FaceBook, Google or a similar OAuth service won’t be an option. The users will be required to authenticate themselves by entering long passwords containing combinations of letters, numbers and special characters.
Even if you are not dealing with the credit card information, there are other areas where the application data must be protected. Take the human resources application - social security numbers (unique ID’s of the USA residents) of employees must be encrypted.
Pretty often our perspective customers send us a questionnaire to see if our security measures are compliant with their requirements. In some cases this document can include as many as 300 questions.
You may want to implement different levels of security depending on what devices is being used to access your application - a public computer, an internal corporate computer, iPad or an Android tablet. If a desktop user forgot his password, you may implement a recovery mechanism that send an email to the user and expects to receive a certain response from him. If the user holds a smartphone, the application can send a text message to his device.
If the user’s record contains both his email and the cell phone number, the application should ask where to send the password recovery instructions. If a mobile device runs the hybrid or native version of the application, the user can be automatically switched to a messaging app of the device so he can read the text message while the main application remains at the view where authentication was required.
In the enterprise Web applications more than one layer of security must be implemented: at the communication protocol level, at the session level, and at the application level. The HTTP server Nginx besides being a high-performance proxy server and load balancer can serve as a security layer too. Your Web application can offload authentication tasks and validation of SSL certificates to Nginx.
Most of the enterprise Web applications are deployed on the cluster of servers, which adds another task to your project plan: how to manage sessions in a cluster. The user’s session has to be shared between all servers in a cluster. High-end application servers may implement this feature out of the box. For example, IBM WebSphere server has an option to tightly integrate HTTP sessions with its application security module. Another example is Terracotta Cluster, which has the Terracotta Web Sessions module that allows session to survive the nod hops and failures. But small or mid-size applications may require some custom solutions for distributed sessions.
Tip
|
Minimize the amount of data stored in the user’s session to simplify session replication. Store the data in the application cache, that can be replicated quickly and efficiently using open source or commercial products (e.g. JGroups, Terracotta et al). |
Here’s another topic to consider: multiple data centers when each one runs a cluster of servers. To speed up the disaster recovery process, your Web application has to be deployed in more than one data centers located in different geographical regions. The user authentication must work even if one of the data centers becomes not operational.
An external computer (e.g. Nginx server) can perform token-based authentication, but inside the system the token is used only when the access to protected resources is required. For example, when the application need to process a payment, it doesn’t need to know any credit card details - it just uses the token to authorize the transaction of the previously authenticated user.
This grab bag of security considerations mentioned in this section is not a complete list of security-related issues that your IT organization needs to take care of. If you work for a large enterprise on the Intranet applications, these security issues may not sound as overly important. But as soon as your Web application starts serving external Internet users, someone has to worry about potential security holes that were not in the picture for internal applications. Our message to you is simple: "Take security very seriously if you are planning to develop deploy, and run a production-grade enterprise Web application".
Every enterprise Web application has to run in a secure environment. The mere fact that the application runs inside the firewall doesn’t make it secure. First, if you’re opening at least one port to the outside world, a malicious code can sneak in. Second, there can be an "angry employee" or just a "curious programmer" inside the organization who can inject the unwanted code.
The proper validation of the received data is very important. Ideally, use the white list validation to compare the user’s input against the list of allowed values. Otherwise do a black list validation to compare against the keywords that are not allowed in the data entered by the user.
With proliferation of clouds, social networks, and sites that offer free or cheap storage people lose control over security hoping that Amazon, Google or Dropbox will take care of it. Besides software solutions, software-as-a-service providers deploy specialized hardware - security appliances that serve as firewalls, perform content filtering, virus and intrusion detection. Interestingly enough, hardware security appliances are also vulnerable.
In any case, the end users upload their personal files without thinking twice. Enterprises are more cautious and prefer private clouds installed on their own servers, where they administer and protect data themselves. The users who access Internet from their mobile devices have little or no control of how secure their devices are. So the person in charge of the Web application has to make sure that it’s as secure as possible.