Systems Architecture & Security, winning at both

Online systems need to be both secure and designed to last, so how can you achieve both and not blow the budget?

This article covers a few simple principles you can adopt that are good for both your systems architecture and your security.

#1 System Components should only do what they say on the tin and no more…

By this I mean a component or module performs only the function it is named for. For example, if you have an e-commerce (EC) cart component, it handles the cart and nothing else: no registration, no sign-in, etc.

The architectural ‘win’ here is that everybody should be able to understand what the component or module (or even subsystem) does from its name alone. No black boxes, no ‘God’ modules, etc. This makes it a lot more likely that the architecture will remain consistent over time, and it makes it that much easier for new hires or team members to understand the system and do the Right Thing by default.

The other architectural win is that it helps avoid what I call ‘spaghetti dependency management’, where you have so much going on in a component that if you actually trace out all the inter-dependencies between modules (calls, languages, libraries, protocols, etc.), it ends up looking like a big plate of spaghetti – minus the tasty sauce!

The security architecture ‘win’ is that it encourages a proper separation of concerns and frustrates a hacker trying to get to the ‘good stuff’, because they need to go through more systems and software than they would in a flatter system.
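As a rough illustration of the ‘only what it says on the tin’ idea, here is a minimal sketch in Python. The `Cart` class and its method names are hypothetical, made up for this example. The point is what is absent: the cart is handed an already-identified customer and knows nothing about registration or sign-in.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical cart component: it holds line items and nothing else."""

    # Identity is established elsewhere; the cart only receives the result.
    customer_id: str
    items: dict[str, int] = field(default_factory=dict)

    def add(self, sku: str, quantity: int = 1) -> None:
        """Add a quantity of a product (identified by SKU) to the cart."""
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.items[sku] = self.items.get(sku, 0) + quantity

    def remove(self, sku: str) -> None:
        """Remove a product from the cart; a missing SKU is ignored."""
        self.items.pop(sku, None)

    def total_items(self) -> int:
        """Return the total number of units in the cart."""
        return sum(self.items.values())
```

Anything to do with who the customer is, or how they proved it, would live in a separate component with its own equally obvious name.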

#2 Keep SQL out of your front end code at all costs!

A very common security ‘vector’ is known as SQL Injection, which is where SQL is ‘injected’ into a query by means of a form field and causes the query to do something completely different to what was expected. This could be anything from bypassing a sign-in form, to extracting data wholesale out of a database, to actually changing the data in ways you didn’t expect or can’t detect until it’s too late…

Note: Although the web form route is by far the most common mechanism, any code that builds SQL queries and can be reached from the outside can be at risk of a SQL injection attack. This even includes queries that make use of data held in the database which was originally externally sourced but not ‘cleaned’ on the way in. This quite often happens when you chain queries together, with one query ‘fed’ with the output of another.

If you must do SQL query construction, do it well away from your Front End code, via an API call to a back-end service that ‘wraps’ your database. This way you have a much smaller ‘security surface’ to make safe against SQL injections (btw NoSQL databases can suffer injections too), and it will probably have a lower change rate than the Front End code; as we know, whenever anything is changed it can be changed in a way that causes security problems.

What is a Security Surface?
A security surface is a boundary or notional wall around all the systems that contribute to the successful operation of a service, for instance a website or an API. The idea is to keep this surface as small as possible, in proportion to how critical the service under your control is. It’s an easy way to envisage the degree of risk you are actually under and how much effort is required to be secure.

The architectural win here is that you now have a service around your database, so you have decoupled the consumers of the service from the implementation, leaving you ‘free’ to evolve, scale or change each independently as needed.
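To make the ‘wrapper’ idea a bit more concrete, here is a minimal sketch using Python’s standard `sqlite3` module. The `UserStore` class, the `users` table and the `make_demo_store` helper are all hypothetical names for illustration only. The front end would call something like `find_by_email` through an API and would never see or build SQL itself.

```python
import sqlite3
from typing import Optional


class UserStore:
    """Hypothetical back-end wrapper: the only place that talks SQL."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find_by_email(self, email: str) -> Optional[tuple]:
        # The ? placeholder passes the value separately from the SQL text,
        # so the input cannot change the shape of the query.
        cursor = self._conn.execute(
            "SELECT id, email FROM users WHERE email = ?", (email,)
        )
        return cursor.fetchone()


def make_demo_store() -> UserStore:
    """Build an in-memory database with one row, purely for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))
    conn.commit()
    return UserStore(conn)


if __name__ == "__main__":
    store = make_demo_store()
    print(store.find_by_email("alice@example.com"))
    # A classic injection string is treated as a plain (non-matching) value.
    print(store.find_by_email("' OR '1'='1"))
```

Because all SQL lives behind one small class, that class is the only code you need to review for injection problems, and you can swap the database underneath it without touching the callers.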

#3 Do not ‘roll your own’ Auth… Here be dragons!

This should go without saying, but unless you absolutely know what you are doing and have the qualifications to back it up, writing your own Auth service from the ground up can be full of dangers, pitfalls and just plain wrong assumptions. It’s much better to use an established methodology and technical stack where possible and adapt it to your needs than to try to make a secure Auth service on your own. BTW: Open Source solutions count here too. They are often very focused and up to date, and can be incorporated into your existing systems deployments as a local service (i.e. the security surface of your systems does not extend out to a third party you do not control).

What is Auth?
Auth is short for Authentication – basically verifying who you are by some set of challenges only you can answer, usually a username and password, but more commonly now with another challenge by a different method (2FA – two factor authentication).

Remember, the core crypto and auth algorithms in use on the Internet are openly published and have Open Source implementations. This openness helps ensure quality and should not be considered a negative marker against their use at all: you are very likely reading this page (and pretty much everything else using HTTPS) via Open Source security code…

BTW#2: See this article for concerns around SLAs with start-up SaaS providers in the Auth space. It’s rapidly evolving and you need to be careful – there are some real dragons here…

This will also ensure you do not ‘mix-in’ your Auth data with your guest data, i.e. make it easy for someone to get Auth data and personal data at the same time, say because it all lives in the same table, which would make it easy pickings for a SQL injection. Auth is a shared service that should exist ‘outside’ of projects and services and be something accessed via an established API – this also makes it very easy to monitor usage and ‘adjust’ your Auth service in response to your security needs over time.
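Here is a rough sketch of what keeping Auth ‘outside’ might look like from an application’s point of view. Every name in it (`AuthService`, `FakeAuthService`, `GuestProfiles`, `subject_for_token`) is hypothetical. The fake is only a stand-in so the example runs; in a real system that interface would be backed by calls to your established Auth service’s API.

```python
from typing import Optional, Protocol


class AuthService(Protocol):
    """Hypothetical client interface to a shared, external Auth service."""

    def subject_for_token(self, token: str) -> Optional[str]:
        """Return an opaque subject id for a valid token, else None."""
        ...


class FakeAuthService:
    """Demo stand-in only; a real client would call the Auth service's API."""

    def __init__(self, tokens: dict[str, str]) -> None:
        self._tokens = tokens

    def subject_for_token(self, token: str) -> Optional[str]:
        return self._tokens.get(token)


class GuestProfiles:
    """Guest data keyed by opaque subject id; no credentials are stored here."""

    def __init__(self, auth: AuthService) -> None:
        self._auth = auth
        self._profiles: dict[str, dict[str, str]] = {}

    def save(self, subject: str, profile: dict[str, str]) -> None:
        self._profiles[subject] = dict(profile)

    def profile_for_token(self, token: str) -> Optional[dict[str, str]]:
        # Ask the Auth service who this is; never inspect credentials locally.
        subject = self._auth.subject_for_token(token)
        if subject is None:
            return None
        return self._profiles.get(subject)
```

With this split, a leak of the guest data store exposes no passwords or secrets, and every identity check goes through one API that you can monitor.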

#4 Don’t put all your eggs in one basket

A common mistake I see is when a group of systems are put together on one set of servers (or sometimes even one server!) in the mistaken belief that because this makes things easier for development and ops, it is somehow better. The big problems here are:

  • It’s a false constraint – in that you are using an implied commonality (be it language or environment) to place unlike systems with different functions together;
  • Your risk of failure from change events is magnified – simply put, the shared operating environment becomes a common mode of failure for everything at once. The usual response to this from Ops is to be very slow in applying updates (as any good ops person knows, changes can lead to failure), which leads to…
  • Increased security risks – the fact the base system is often months behind on security patches (and other changes that could be applied) means it’s open to attack. Also, the fact everything is running in the one environment means a successful attack on one system often makes it easy for hackers to move laterally through the ‘basket’ and compromise a lot of systems in one go… And if different systems in the same basket are in different security ‘domains’, it’s a serious breach of the security model of the business (which could have legal implications if PII, financial or health data is involved).
  • Increased risk of business failure – All this also means rebuilding the ‘basket’ to a known good state is that much more difficult, so the recovery time from a downtime-causing hack will be that much longer. It might even be fatal to the business if the business logic and state weren’t properly captured and backed up.
  • Otherwise unrelated systems ‘fight’ for resources – Yes, you could have this as part of an auto-scale group, but remember that what you are scaling on could just be the result of the ‘in-fighting’ over CPU, IO & memory in such a ‘one basket’ situation. If the systems were properly decoupled, it’s actually quite likely you could achieve the same scaling with less expensive resources.
  • It’s just a bad development practice – Engineers end up getting ‘stuck into a rut’ on how they view the systems and as a result start tying things together in ways they shouldn’t – making it even more difficult to separate things out and get proper scalability and reliability. Tech gets developed on the basis of how easy it is rather than whether it’s the right thing to be doing in the first place. In essence the systems become more fragile to change than they should otherwise be.

So if you see a server or set of servers in your architecture with a basket of ‘mixed systems’, you need to pay it very close attention, as it could be a business disaster in the making. Quite often this ‘one basket’ approach is taken as a cost-saving measure, but the cost of operating a small ‘fleet’ of servers could well be lower than you think (the prices of server instances on the cloud keep dropping; in some cases they are down to a few cents per hour, well less than $20 a month) and is certainly much less than the cost to the business brand and bottom line of a critical failure event.

#5 Centralize your logging, know what is happening on your systems at all times…

Again, this is similar to the Auth case: you should set your system or service up so that the logs all get copied to one central location or service. Again, look for standard solutions to this; it will make the job of examining logs and putting in alerts so much easier. It also means that if you do suffer an attack, you have a record independent of the system attacked, so you can be sure it wasn’t modified during the attack.

This is critical if you deal with any form of sensitive information (such as names, addresses, telephone numbers, account numbers, etc) – you need to prove what happened when and why – good independent logs are core to this.
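As one small example of a ‘standard solution’, Python’s standard library can ship log records to a syslog collector. The sketch below assumes a central collector is listening for syslog over UDP; the `build_central_logger` helper, the service name and the host and port are made up for illustration. Plain UDP syslog is unencrypted and fire-and-forget, so treat this as a starting point rather than a finished design.

```python
import logging
import logging.handlers


def build_central_logger(service_name: str, host: str, port: int) -> logging.Logger:
    """Return a logger that sends records to a central syslog collector (UDP)."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger

    # Attach the handler only once, even if this function is called again.
    already_attached = any(
        isinstance(h, logging.handlers.SysLogHandler) for h in logger.handlers
    )
    if not already_attached:
        handler = logging.handlers.SysLogHandler(address=(host, port))
        handler.setFormatter(
            logging.Formatter("%(name)s: %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # Hypothetical collector address; replace with your own central log host.
    log = build_central_logger("cart-service", "127.0.0.1", 5140)
    log.info("order placed order_id=%s", "A123")
```

The record leaves the machine as soon as it is written, so an attacker who later gains control of that machine cannot quietly rewrite history on the collector.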


I hope these points have been of help to you in your architecture and security design. If you have any questions, or would like some help, please feel free to get in touch.