This is a Patreon-supported essay. Drafts of all of these essays go out a week early to my $10 and up subscribers. Building secure applications is hard, and for organizations that have never done it before, it's often unclear where to even start. Worse, organizations that have some development experience often underestimate the work required to ship secure code. My goal with this essay is to make the landscape more legible and give NGOs and other organizations an idea of where to start. This is part two of a four-part essay; part one is here, part three is here, and part four is here.
© 2021 Eleanor Saitta.
This is part two of my guide to secure software development for NGOs and other organizations. You can find part one here and parts three and four here and here. In the previous section, we looked at the lifecycle of software, the organization creating it, and the design process and how it impacts security. In this section, we'll look at everything that comes between design and actually writing code.
Once you understand the requirements the system is intended to implement, it's time to start building your threat model. A threat model is a formal, human-readable model of all of the security-relevant, in-scope parts of the system.
Threat models come in two parts — the requirements model and the architecture model. In the requirements model, you want to understand everything that can go wrong in the system at the level of user tasks and goals, how bad those negative outcomes are, relatively speaking, and what the system's response should be if an adversary tries to make one of those negative outcomes happen. The requirements model also enumerates all of the roles in the system, the assets (the things you're trying to operate on or protect), and who's allowed to do what to what. The requirements threat model draws on and complements requirements documentation, especially the security requirements and identified security properties. The threat model formally shows what those security properties mean within the rest of the requirements and the architecture.
The goal of the requirements-level threat model is to create a set of security objectives. The security objectives concisely encapsulate the security-relevant responses of the system to adversaries and will shape mitigations selected during architectural design.
If you did your work correctly while developing the requirements, the threat model should go quickly. You shouldn't wait to finish the requirements to start threat modeling, however, even if you aren't doing agile development. The requirements and the threat model should be developed together iteratively. If you are doing agile development, the threat model must be a living document and threat model updates should be a part of every user story. In all cases, the threat model should be a key element for communicating security goals across the development team.
A word on formality: when I say formal in a threat model, I mean something where you can understand procedurally if your threat model is complete and internally consistent. This isn't the same as “formal verification”, which attempts to create mathematical proofs of correctness from source code. Threat models are just models, and there's some intentional fuzz between the model and the implementation. The goal of a threat model is to represent architecture and security intentions in detail so they can be analyzed and compared against the implementation, not to directly operate on the implementation. Among other things this means we can build threat models in a reasonable amount of time, something rarely true of proofs. That layer of fuzz means we can spend more time thinking about the complexities of human intent, which is the hard part of analyzing requirements and architecture for security.
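To make "procedurally complete and internally consistent" a little more concrete, here is a minimal sketch of a requirements model held as data. The role, asset, and action names are invented for illustration, and the two checks are only an assumption about what completeness and consistency might mean for a very small model: every combination of role, action, and asset has an explicit decision, and no rule mentions something the model never declared.

```python
from dataclasses import dataclass, field


@dataclass
class RequirementsModel:
    roles: set[str]
    assets: set[str]
    actions: set[str]
    # (role, action, asset) -> "allow" or "deny"; every combination should be decided.
    rules: dict[tuple[str, str, str], str] = field(default_factory=dict)

    def undecided(self) -> list[tuple[str, str, str]]:
        """Return every (role, action, asset) triple with no explicit rule."""
        return sorted(
            (r, a, s)
            for r in self.roles
            for a in self.actions
            for s in self.assets
            if (r, a, s) not in self.rules
        )

    def unknown_references(self) -> list[tuple[str, str, str]]:
        """Return rules that mention a role, action, or asset not in the model."""
        return sorted(
            key
            for key in self.rules
            if key[0] not in self.roles
            or key[1] not in self.actions
            or key[2] not in self.assets
        )


# Hypothetical model for a small reporting tool.
model = RequirementsModel(
    roles={"reporter", "editor"},
    assets={"draft_report"},
    actions={"read", "publish"},
    rules={
        ("reporter", "read", "draft_report"): "allow",
        ("reporter", "publish", "draft_report"): "deny",
        ("editor", "read", "draft_report"): "allow",
        ("admin", "read", "draft_report"): "allow",  # role never declared
    },
)

print(model.undecided())           # the editor/publish decision is missing
print(model.unknown_references())  # the rule for the undeclared "admin" role
```

A real threat model carries far more than an access matrix, but the point stands: if the model is structured, gaps like these can be found mechanically rather than by rereading prose.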
Architecture is where the hand-off from the design team to the development team starts. Of course, the idea that either design or architecture can occur in isolation from each other is a myth. The design team needs to know what architecture and development can support or enable. The more novel the system, the more interaction there will need to be between design and architecture.
During architectural design the system is decomposed into a set of components that interact to satisfy the requirements. In agile development, the basic structure of the core architecture will still probably be determined up front, even if low-level components are swapped out as development progresses. In general, architectures mirror team structure. Open source applications developed by distributed teams will often look more like loosely-coupled sets of libraries, and applications created by unified development teams working together in person are more likely to be single unified systems. Neither model will necessarily result in a better or more secure outcome. A good fit between team and architecture can lead to a smoother development process.
The architectural design process is where security properties specified in the design phase must first be turned into system-wide technical design problems and then, as these are solved, into concrete implementation specifications. The kind of system you're building determines how much detail is needed in architectural documentation. That said, more thorough and readable documentation will help bring new programmers onto the team more quickly and make security testing easier, provided it's kept up to date as the system evolves. Good architecture documentation helps programmers write code with fewer bugs. At the very least, all security concerns and all places where system rules need to be enforced must be documented.
All systems involve parsing some kind of input, whether that's a binary image format like a JPEG, input a user types on a keyboard, or JSON data received over the Internet. All systems involving operations over a network (and even many that don't) have to worry about protocols, whether they're speaking HTTPS on the web or sending data to another device over a USB port. The protocols you choose constrain the security properties your system can provide. Be certain you understand what the security properties of the protocols you use are and what the requirements for maintaining those properties are, both for developers and users. Developing new protocols is hard and time-consuming work. If you don't need to do it, don't.
Parsers, the code that recognizes input data and manages the state of protocols, are the single biggest source of low-level vulnerabilities in systems. Every input format you use and every protocol your application speaks needs to have a formal specification, and every parser you use should be programmatically generated from such a specification. If you get this right up front you'll make your life much easier later. There are libraries that can handle the parser generation for you, but during the architecture stage, you need to write or adopt specifications for your protocols. It can be tempting to skip this step, especially if you're planning on just using third-party libraries that already implement protocols for you. You should never implement a protocol yourself if a good, tested implementation exists that meets your needs, but when choosing among implementations, all else being equal, favor those that use generated, known-complete and correct parsers over those that use hand-written ones. If you want to know more about parsers and security problems, talk to @maradydd and the folks at langsec.org.
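As a toy illustration of the difference between a specification and hand-written parsing code, the sketch below writes a made-up message header down as data and builds a strict parser from it. The wire format, field names, and size limit are all hypothetical, and this is nowhere near a real parser generator; it only shows the shape of the idea, where the parser accepts exactly what the specification describes and rejects everything else.

```python
import struct

# Hypothetical wire format, written down as data instead of as parsing code.
# Each entry: (field name, struct format code, set of allowed values or None).
HEADER_SPEC = [
    ("version", "B", {1}),
    ("msg_type", "B", {1, 2}),
    ("length", "H", None),  # payload length in bytes
]
MAX_PAYLOAD = 1024


def make_parser(spec, max_payload):
    """Build a strict parser function from a header specification."""
    # ">" selects big-endian (network order) with no padding between fields.
    header = struct.Struct(">" + "".join(code for _, code, _ in spec))

    def parse(data: bytes) -> dict:
        if len(data) < header.size:
            raise ValueError("truncated header")
        values = header.unpack_from(data)
        message = {}
        for (name, _, allowed), value in zip(spec, values):
            if allowed is not None and value not in allowed:
                raise ValueError(f"invalid {name}: {value}")
            message[name] = value
        if message["length"] > max_payload:
            raise ValueError("payload too large")
        payload = data[header.size:]
        # The declared length must match exactly: no short reads, no trailing bytes.
        if len(payload) != message["length"]:
            raise ValueError("length field does not match payload")
        message["payload"] = payload
        return message

    return parse


parse_message = make_parser(HEADER_SPEC, MAX_PAYLOAD)
print(parse_message(b"\x01\x02\x00\x03abc"))
```

The useful property is that changing the format means changing `HEADER_SPEC`, not editing parsing logic by hand, so the specification and the parser can't quietly drift apart.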
Your system almost certainly does not require any novel cryptographic primitives. If it does, you have likely misunderstood the problem. If you are doing anything more complex than using off-the-shelf cryptographic libraries in well-documented ways, you definitely need expert help. In some specific scenarios, there may be a need to combine existing cryptographic primitives in less-common ways. If so, the section "Selecting Cryptographers" below is for you. Under no circumstances should your team ever attempt to implement cryptographic primitives themselves. Doing so guarantees you will screw up, probably tragically, and cause significant harm if your application sees wide adoption.
When selecting which primitives and key sizes to use, it's important to do the research into what's currently recommended as these things do change somewhat regularly. The list of things that you need to think about also varies depending on which primitives you're using. While you always need to know, for instance, that you're using a cryptography-appropriate random number generator, you only need to remember to sign first and encrypt second if you aren't using an authenticated encryption scheme that solves this for you. Especially for cryptography, favor systems that require you to keep track of fewer security-critical properties.
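The random number generator point is one of the few here that fits in a handful of lines. The sketch below uses Python's standard `secrets` module, which draws from the operating system's cryptographic randomness source, rather than the general-purpose `random` module, which is not suitable for keys or tokens. The function name `tokens_match` is just an illustrative helper.

```python
import hmac
import secrets

# 32 random bytes from the OS's cryptographic source, e.g. for a symmetric key.
key = secrets.token_bytes(32)

# A URL-safe random string, e.g. for a password-reset or session token.
token = secrets.token_urlsafe(32)


def tokens_match(expected: str, supplied: str) -> bool:
    """Compare two tokens without leaking where they differ through timing."""
    return hmac.compare_digest(expected.encode(), supplied.encode())


print(len(key))                     # 32
print(tokens_match(token, token))   # True
```

Nothing here selects a primitive or a key size for you; it only avoids one well-known way of getting the randomness wrong.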
In general, whatever @hashbreaker and @matthew_d_green can agree on is what you should use. If they disagree, @mattblaze can be a tiebreaker. If you're building centralized systems that don't use end-to-end encryption (where clients control the keys themselves), you should probably just use TLS and make sure that you get an A grade from SSLLabs.com. If you're doing anything else, you want libraries that have already done most of the thinking for you. Something like the NaCl library already has most of the choices you might otherwise struggle with baked in with sensible defaults. The less cryptographic code you write yourself, the fewer chances you have to screw up, assuming the library you use is well-tested and vetted. That last part is a big caveat. OpenSSL, while standard across the industry, has still had many serious bugs recently because it wasn't as well-tested as we thought. Worse, it gives you a lot of ways to shoot yourself in the foot. All else being equal, smaller, simpler libraries are often more useful, especially if they protect you from low-level details.
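For the "just use TLS" case, the safest client code is usually the code that leaves the library defaults alone. As a minimal sketch, assuming a Python client using the standard `ssl` module, the helper below (the name `make_client_context` is mine) starts from the default context, which verifies certificates and hostnames, and only tightens the minimum protocol version.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Return a client-side TLS context built on the library's secure defaults."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

The context can then be passed to whatever makes the connection, for example `urllib.request.urlopen(url, context=ctx)`. The thing to watch for in code review is anyone setting `check_hostname = False` or `verify_mode = ssl.CERT_NONE` to make an error go away.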
The cryptographic systems and protocols that you adopt heavily influence the security properties of your system. You need to be thinking about cryptography and protocols in general terms starting from at least the security design phase of development. Changing protocols or swapping a primitive out for something that's not directly equivalent can, from a security perspective, be as large a change as completely re-architecting your system on the back-end.
As Matt Green and Dan Bernstein are often busy, if you need to do any low-level cryptographic implementation or review you'll probably need to go with someone else. The best way to figure out who to hire is probably to delegate the decision to a reputable security auditing firm, preferably one that does nothing other than security. Like evaluating cryptographic primitives, evaluating cryptographers is hard for folks who don't spend all their time paying attention to who's doing what kind of work.
If you do have to do this selection directly, look for folks with solid track records as both publishing academics and system implementers. Any working cryptographer who isn't inside an intelligence service will be publishing in academic journals. However, many purely academic cryptographers don't have the implementation experience required to write correct, production-grade primitive implementations or to evaluate the use of a primitive in context. Cryptography is a very small field; if you're not sure about the person you're thinking of hiring, ask for references and ask around.
As you begin the architecture design process, you'll also be beginning the next phase of threat modeling, the architecture model. In this level of the threat model we enumerate all of the components of the system in sufficient granularity to capture all of the trust boundaries in the system and map their connections. Next, we look at all the actions the system is intended to support at the requirements level and see how those actions flow through the system. We also model all of the supporting actions required to implement those requirements-level events, like login flows. With this model, we then look at how each step of each action could fail and whether that failure could compromise any of the security objectives of the system or any of the security properties it attempts to maintain. If flaws are found, mitigations are added and documented or the architecture is adjusted.
This detailed architectural threat model has a number of benefits. First, it ensures the architecture of the system is documented (a common failure mode). It also ensures that the location of enforcement for every rule in the system (such as access control or resource limitation) is documented, understood, and agreed upon. A proper threat model demonstrates concretely that, if implemented correctly, the system as-architected can meet its security objectives.
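The architecture model can also be held as data, which makes some of the analysis mechanical. Below is a minimal sketch under invented assumptions: three hypothetical components, two trust zones, and a single rule that any flow crossing a trust boundary needs at least one documented mitigation. A real model would track far more (assets carried, failure modes, where each rule is enforced), but the structure is the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    trust_zone: str  # components in the same zone are assumed to trust each other


@dataclass(frozen=True)
class Flow:
    source: Component
    dest: Component
    action: str
    mitigations: tuple[str, ...] = ()


def unmitigated_boundary_crossings(flows):
    """Return flows that cross a trust boundary with no documented mitigation."""
    return [
        f
        for f in flows
        if f.source.trust_zone != f.dest.trust_zone and not f.mitigations
    ]


# Hypothetical system: a browser talking to an API server backed by a database.
browser = Component("browser", "internet")
api = Component("api_server", "backend")
db = Component("database", "backend")

flows = [
    Flow(browser, api, "login", ("TLS", "rate limiting")),
    Flow(browser, api, "upload_report"),  # crosses a boundary, nothing documented
    Flow(api, db, "store_report"),        # stays inside one trust zone
]

for f in unmitigated_boundary_crossings(flows):
    print(f"{f.source.name} -> {f.dest.name}: {f.action}")
```

This doesn't replace walking through each step of each action and asking how it could fail; it just makes sure the walk covers every flow that was written down.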
With agile processes an architectural model is still constructed and fully fleshed out for the initial architectural concept before any code is written. In many cases, the start of development will, in addition to the selection of frameworks, include a lump of development work to get the system to a point where it has enough of a coherent whole that further features can be seen as discrete additions. While this initial work may be understood as a set of sprints or user stories, it's also often planned as a unit. It's this unit that should undergo architectural threat modeling. Once the initial model is constructed, each additional user story or feature addition can then be seen as a corresponding addition to the threat model.
Once the (initial) architectural threat model is complete, development can move forward, as the team can now understand the security requirements for each module in the system.
If you're not in a position to build a development team in-house with the experience required to deliver secure applications, you'll need to bring in an external team. Many organizations may be familiar with working with development teams in general, but less familiar with the specific demands of higher-security applications. Development, like design, has different specializations, but the security specializations for development are more common. Who you need and the size of the team will be driven by the scope of the application and its complexity. When hiring a development team, while the usual markers around budget and delivery ability matter, you're looking for a few specific things when it comes to security.
Any external team you bring in should be able to explain things like how they'll handle threat modeling, security architecture development, security standards and frameworks selection, and testing. Any team that doesn't have solid answers to questions like these should not be considered. Having them talk you through the kinds of security vulnerabilities they'd expect to see in a system like yours and how they'd mitigate those vulnerabilities may be useful. It's also worth asking for samples of previous work and indications of the kinds of security concerns they've dealt with in that work. If they're able to share them, asking to see audit reports from reputable security auditors for previous work may prove useful.
Applications intended to help high-risk and specifically-targeted users have different security considerations than your run-of-the-mill enterprise IT tool or consumer application. What might be a small privacy concern elsewhere can be a complete showstopper. Development organizations that have not previously built applications for high-risk users should not be selected, nor, as a rule, should advertising or media agencies with no experience in the field.
At the end of the day, it's difficult to package up the kind of evaluation framework used to judge if a development team will be capable of producing highly-secure applications above and beyond their prior work without also passing on the knowledge needed to review code and application architectures for security. If you're in the position of needing to evaluate whether a development team is likely to be able to deliver sufficiently secure applications, I'm happy to come in as a consultant to help. If you have a security consultant who specializes in application security who you've worked with before, they should also be able to help.
Selecting secure frameworks is complicated, for the same reason that selecting secure protocols is. Few frameworks have chosen to start development from the perspective of trying to provably eliminate categories of low-level bugs. Many other things also inflect the choice of frameworks, including compatibility with other parts of the system, whether they're actively maintained, developer productivity, interactions with other libraries or frameworks, and how familiar a team is with them. Unlike protocols, you may be using a number of different frameworks together in the same context. The interactions between frameworks can create new security vulnerabilities neither framework has alone.
Choosing to use a framework that does not eliminate all classes of low-level bug to which it's vulnerable adds security requirements to your development process. If you know your frameworks can leave you exposed, you have to build the fix yourself, either in code or in standards and processes. It is an absolute requirement of writing secure code to fully understand all the classes of problems that your choices of platform, language, and frameworks make you potentially vulnerable to and to ensure you have a strategy for dealing with each of them.
Ideally, you'll fix each issue class generically at the code level by writing a library that you use the framework through, removing the possibility of the problem. If you do this and do it correctly (you'll need to both test this code carefully and have outside experts review it), then all you need to do is make sure your library is used correctly everywhere it needs to be and that no one uses the framework directly. This isn't as good as having the fix integrated into the framework, but at least it means there's one simple thing to remember to do instead of a bunch of potentially complex things. However, writing code like this can be hard and time-consuming and there are times when teams must make tradeoffs. If you can't make it easy for your developers, you'll have to manage the risk via coding standards.
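To illustrate the wrapper idea, here is a minimal sketch using SQL injection as the issue class and Python's built-in `sqlite3` standing in for the framework. The `ReportStore` class and its table are hypothetical. The rest of the application never writes SQL; it calls these methods, and every query inside them passes values as parameters rather than building strings.

```python
import sqlite3


class ReportStore:
    """The only module allowed to import sqlite3 (a hypothetical project rule)."""

    def __init__(self, path: str = ":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS reports "
            "(id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
        )

    def add_report(self, author: str, body: str) -> int:
        # Values are bound as parameters, never formatted into the SQL text.
        cur = self._db.execute(
            "INSERT INTO reports (author, body) VALUES (?, ?)", (author, body)
        )
        self._db.commit()
        return cur.lastrowid

    def reports_by(self, author: str) -> list[str]:
        rows = self._db.execute(
            "SELECT body FROM reports WHERE author = ? ORDER BY id", (author,)
        )
        return [body for (body,) in rows]


store = ReportStore()
store.add_report("alice", "first report")
print(store.reports_by("alice"))
print(store.reports_by("' OR '1'='1"))  # treated as a literal name, matches nothing
```

The wrapper only helps if the rule that nobody touches the database directly is actually enforced, which is where code review and coding standards come back in.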
Every team and every individual developer has coding standards. The standards may or may not be written down, and if the team doesn't have consistent habits, they may not follow their own standards, but they're there. For teams working together, it's important to align standards for how things are done for many reasons, including team members being able to read each other's code quickly and reliably. When trying to write secure code, standards become even more important, both for initial development and for correctly understanding the security implications of parts of the system during maintenance and refactoring.
While the set of standards required for a development team will vary by the problem space, language, frameworks, methodologies, and processes used, some security considerations remain the same. It's important to have and document a standard for how you're handling every class of security issue that your environment and frameworks leave unhandled. That means that not only should you have general documentation for the solution, but there should be a standard way that instances of the solution are called out in comments. This is especially important for more complex solutions so reviewers and others can identify what's going on and check for correctness. Similarly, all per-module requirements (see part three) that your threat model and similar artifacts help you generate should be documented.
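One way to make "a standard way that instances of the solution are called out in comments" checkable is to pick a fixed tag format and write a small tool that lists every use. The `SECURITY(issue-class): note` format below is an invented convention, not an established one, and the scanner is a sketch of how a reviewer might pull all such notes out of a file.

```python
import re

# Hypothetical project convention: "# SECURITY(issue-class): explanation"
TAG = re.compile(r"#\s*SECURITY\((?P<issue>[a-z0-9-]+)\):\s*(?P<note>.+)")


def security_notes(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, issue class, note) for each tagged comment."""
    notes = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = TAG.search(line)
        if match:
            notes.append((lineno, match.group("issue"), match.group("note").strip()))
    return notes


example = '''\
def render(name):
    # SECURITY(xss): name is user-controlled; escaped via html.escape below.
    return "<p>" + escape(name) + "</p>"
'''

print(security_notes(example))
```

With a listing like this, a reviewer can go through every place a given issue class is handled and check each one against the documented solution.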
Standards act as reminders as well as agreements. If, for instance, you have a piece of code that needs to be called on every page of a site to check for access control and for some reason you can't automate this at the framework level, this should be called out in the project standards. Standards don't have to be heavyweight — they can be as simple as a checklist with a half-dozen items to remember for each new module if that's all you need.
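When a check like the per-page access control call can't be automated in the framework, it can sometimes still be verified automatically. The sketch below uses an invented miniature routing table rather than any real web framework: handlers are either wrapped in `requires_role` or explicitly marked `public`, and `unchecked_routes` lists any handler where someone forgot to decide.

```python
import functools

ROUTES = {}  # path -> handler; stands in for a real framework's routing table


def route(path):
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register


def requires_role(role):
    """Wrap a handler so it refuses users who lack the given role."""
    def decorate(handler):
        @functools.wraps(handler)
        def wrapper(user, *args, **kwargs):
            if role not in user.get("roles", ()):
                raise PermissionError(f"{role} role required")
            return handler(user, *args, **kwargs)
        wrapper.access_checked = True
        return wrapper
    return decorate


def public(handler):
    """Explicitly record that a handler needs no access check."""
    handler.access_checked = True
    return handler


@route("/reports")
@requires_role("editor")
def list_reports(user):
    return ["report-1"]


@route("/health")
@public
def health(user):
    return "ok"


@route("/export")
def export_all(user):  # nobody decided on access control here
    return "everything"


def unchecked_routes():
    """Return paths whose handlers carry neither an access check nor a public mark."""
    return sorted(
        path for path, handler in ROUTES.items()
        if not getattr(handler, "access_checked", False)
    )


print(unchecked_routes())  # ['/export']
```

Run as part of the test suite, a check like this turns a checklist item into a failing build, which is harder to forget.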
Subtle security issues can also arise from things like unclear or inconsistent variable naming resulting in bugs where the wrong values are used in the wrong places. Standards can't fix this completely, of course, but they can both help and make these issues easier to catch. How thorough your standards and policies for development need to be will depend on the size of your team. Faster-paced development and larger teams require more emphasis on standards to keep the process manageable.
When you select a framework or a set of libraries to use in your system, you're adding dependencies between your system and that external codebase. Not only that, but the libraries you use probably have dependencies on other libraries, too, and in some cases they may have just copied an old version of that library's code into their source tree so you won't be able to tell easily. It's critical that you have a list of all the third-party code in your system, both what you're using directly and what's included indirectly, and that you know the security state of all of it. If there are open vulnerabilities in any of this code you need to make sure you've fully mitigated them, and you need to make sure you patch when patches are available. If code you use depends on old versions of libraries with known bugs, you need to replace those dependencies or find another solution. In general, you want to be using the newest sufficiently-stable versions of your dependencies, even if there aren't known vulnerabilities. There may be security-relevant features (like upgrades to SSL handling) that are missing in older versions.
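A first pass at the dependency list can often be generated rather than written by hand. As a minimal sketch, assuming a Python project, the standard `importlib.metadata` module can enumerate everything installed in the current environment, including packages pulled in indirectly. The function names are mine.

```python
from importlib import metadata


def installed_packages() -> dict[str, str]:
    """Map every installed distribution name (lowercased) to its version."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that have no usable metadata
            packages[name.lower()] = dist.version
    return packages


def declared_requirements(name: str) -> list[str]:
    """Return the requirement strings an installed distribution declares."""
    return metadata.requires(name) or []


for name, version in sorted(installed_packages().items()):
    print(f"{name}=={version}")
```

This only sees what the package manager knows about. Code that a library has copied into its own source tree won't show up, so that part of the inventory still needs someone to look, and the list still has to be checked against known vulnerabilities by some other means.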
Most libraries should have at least a bug tracker and hopefully also an announcement list — ideally one that's focused just on security. You need to have someone reading those lists and watching those bug trackers for as long as you're maintaining and using the system you build. At any moment a vulnerability could be announced in one of your dependencies that puts your users at risk, and you need to be ready to patch or mitigate in a timely manner. We'll see more on this in part four in “Incident Response” and “Long-Term Maintenance”.
It's worth knowing if your third-party dependencies have been audited. If they have, you should ask to see the audit reports and look at the kind of bugs that have been found and what the response was. If a number of bugs of one type have been found and no concerted effort has been made by the library's development team to eliminate others like them, there may be more. If you're going to get your system audited, you'll need to choose how much third-party code is included in that audit. Knowing which libraries have already been looked at can help with that decision. If you do audit third-party code, make sure you and your security team know how to privately file security bugs with the upstream development team. It may also be worth notifying them you're doing the audit so they can prepare development resources to fix bugs or to integrate your patches if you're planning on developing mitigations yourself.
Just like libraries, third-party services and hosting services you interact with also add dependencies to your system. Unlike libraries, you have little control over what these dependencies do. While a good hosting provider will keep their systems as secure as possible and won't do anything without telling you, they may still be compelled by law enforcement to take down your system, to reveal any data they can, or even potentially to try to modify the systems your code is running on. Even excluding law enforcement, providers can be compromised or can decide to take potentially harmful actions for political or commercial reasons.
In more traditional enterprise or consumer environments, reliance on third-party services is common and increasing. In higher-risk environments, centralization and third-party services can be extremely dangerous. Of particular note are user behavior-tracking systems. Many designers and developers are used to relying on third-party tracking tools to understand how people are using their system. Many of these tools attempt to identify users, leak their data and behavior, and sell that data to third parties. Including these tools in systems intended to maintain user privacy is directly counterproductive. The behavior of all third-party systems and the particular risks they entail should be included in all threat models.
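Third-party tracking tends to creep in through a script tag someone pasted into a template, so it's worth having a cheap way to notice. The sketch below, using only Python's standard library, lists hosts other than your own that a page loads resources from. The page content and host names are made up, and a static scan like this won't see anything that scripts load at runtime.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tags whose src/href cause the browser to fetch something when the page loads.
LOADING_TAGS = {"script", "img", "iframe", "link"}


class ExternalResourceFinder(HTMLParser):
    def __init__(self, own_hosts):
        super().__init__()
        self.own_hosts = set(own_hosts)
        self.external = set()

    def handle_starttag(self, tag, attrs):
        if tag not in LOADING_TAGS:
            return
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname  # None for relative URLs
                if host and host not in self.own_hosts:
                    self.external.add((tag, host))


def third_party_hosts(html: str, own_hosts) -> list[tuple[str, str]]:
    """Return sorted (tag, host) pairs for resources loaded from other hosts."""
    finder = ExternalResourceFinder(own_hosts)
    finder.feed(html)
    return sorted(finder.external)


page = (
    '<script src="https://tracker.example.net/t.js"></script>'
    '<img src="/logo.png">'
    '<a href="https://other.example.com/about">About</a>'
)
print(third_party_hosts(page, {"example.org"}))
```

Every host this turns up is a third party that learns something about your users, and each one belongs in the threat model.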
Decentralization and reducing dependencies on third-party services and hosting providers can effectively reduce the exposure of the system. Centralized systems always rely on some level of security by policy, which is only effective until policy changes. Replacing that with security by design allows you to make stronger statements about system behavior over time.
If you liked this essay, you can sponsor me writing more. I've started a Patreon where you can pledge to support each essay I write. I'm hoping to put out one or two a month, and if I can reach my goal of having a day of writing work funded for every essay, it will make it much easier for me to find the time. In my queue right now are the next two pieces of this series, more updates to my piece on real world use cases for high-risk users, and a multi-part series on deniability and security invariants that's been in the works for some time. I'd much rather do work that helps the community than concentrate on narrow commercial work that never sees the light of day, and you can help me do just that.