Cryptography

Encryption and Its Role in Election Security

The security of the United States’ electoral process has come into question from a number of different angles. On the one hand, the increase in mail-in voting due to COVID-19 has caused some politicians to call the validity of the election into question. This is largely based upon unsupported allegations of widespread fraudulent use of absentee ballots.

On the other hand, the electronic voting machines that are in widespread usage and are the predominant method by which US voters cast their votes have a number of known security issues. Assessments of the security of over 100 voting machines at the 2019 DEFCON conference found that all of them contained exploitable vulnerabilities.

Applying Cryptography to Election Security Threats

The security of the US election infrastructure is a vital concern, and cryptographic algorithms have a number of potential applications for voting machine security. The CIA Triad (confidentiality, integrity, and authenticity) outline the primary goals of cryptographic algorithms, and all of these apply to election infrastructure.

Confidentiality: Revealed Votes

One of the key components of a fair election process is the ability for voters to cast their ballots in secret. If the individual votes within an election are not confidential, then the potential exists for coercion or blackmail to impact the results of the election.

Ensuring data confidentiality is the primary goal of an encryption algorithm. Homomorphic encryption algorithms enable algebraic operations to be performed on encrypted data without revealing the data itself. This can be applied to election infrastructure by enabling ballots to be encrypted and tallied without revealing their actual contents.

Integrity: Voting Machine Glitches

Voting machines are computers, and computers have the potential for software bugs or glitches that impact their operation. This is not even a theoretical threat to the integrity of the electronic voting systems in use today.

In a 2019 Pennsylvania election, a voter reported that a touchscreen voting system was acting oddly and not properly recording her votes. Analysis of the paper trail associated with this machine determined that a candidate for whom the machine recorded a total of 15 votes actually won the election by over 1,000. Despite this potential issue with the integrity of electronic storage, 12% of voters use electronic voting machines with no paper backup.

Cryptographic tools such as hash algorithms, digital signatures, and message authentication codes (MACs) are designed to protect the integrity of data by alerting if any modifications have been performed. The use of one of these solutions (potentially stored separately from votes or provided on a receipt to voters) could help to detect issues that affect the accuracy of the vote.

Authenticity: Modified Votes

The most significant threat to the integrity of the US electoral process is the potential for votes to be modified in a way that impacts the result of the election. The potential for this to occur has been demonstrated in numerous ways, including a study that found that 94% of voters didn’t notice that their votes had been changed between a touchscreen system and the associated paper record.

Authenticating the identity of a message sender is one of the primary goals of digital signature algorithms. Voters issued identification cards with digital certificates stored on them could generate a digital signature as part of the voting process. If their ballot or vote was modified in any way, the signature would not be valid, making the tampering easily detectable.

Securing US Elections with Cryptography

That the US elections infrastructure has security issues is unquestionable. Numerous assessments have demonstrated that voting machines violate fundamental cybersecurity best practices. However, US law limits security researchers’ ability to perform testing of voting machines, and some voting machine manufacturers are pushing to make tests that they have not authorized (and where they cannot control the distribution of the results) illegal.

Despite these issues, options exist for improving the security of US elections. Maybe one day, the US will use modern encryption solutions to bolster election security.

The Problem with Cloud Storage for Secure File Sharing

In the modern business, the ability to quickly and easily share documents and other files throughout the business is essential. Everyone works in teams, and sharing documents via email or shared file servers is inefficient.

Cloud-based document sharing services, like Dropbox and Google Drive, offer a tempting alternative to traditional methods of document sharing. Tools like Google Docs enable an entire team to edit a document in parallel and track the complete revision history of the document, making it easy to attribute and revert edits.

However, these cloud-based document sharing services also have their downsides. Employees using these services have to make a choice between efficiency and security, and many choose efficiency. As a result, a number of organizations have suffered data breaches caused by employee negligence in configuring and securing cloud-based services.

Security Challenges of Cloud-Based File Sharing

Cloud Service Providers (CSPs) provide built-in security configuration settings for their environments. While the details vary from CSP to CSP, many of them operate on a simple private/public access model.

A private cloud, as the name suggests, is private. In order to access the cloud-based resource, an employee needs to be explicitly invited to access the resource. On Google Drive, for example, this invitation comes in the form of the document owner (or other administrator) sending a sharing link to the person’s email address.

While this system is effective at securing access to the cloud-based resource, it also creates significant overhead for the document administrators. They must explicitly invite every user of the cloud-based resource and manually revoke permissions if access is abused. While Google Docs keeps an edit history, making it possible to detect such abuse, the document administrator would have to manually review this for anything suspicious.

The overhead associated with properly securing cloud-based resources drives many users to go to the opposite extreme. By marking the cloud-based resource as public, the employee can share access to the document simply by sharing the URL of the document with the desired recipient.

The primary benefit of this system is that anyone with access to the link has access to the resource, making it easy to invite new users. The primary downside of this system is that anyone with access to the link has access to the resource, making it easy for unauthorized users to discover and access the document.

Many people incorrectly believe that it is difficult to find the URL of a cloud-based resource if you are not an authorized user of the resource. Even ignoring the possible cases where an authorized user forwards the link to an unauthorized use, cloud URLs are relatively easy to discover. Hacking tools exist specifically for scanning the space of possible cloud URLs (they have a set scheme), checking if a given URL is valid, and checking if it is public. In fact, most known cloud data breaches were discovered in this way. An ethical hacker using these tools identified an unsecured cloud resource and notified the owner.

Beyond the access control issues associated with cloud resources set to “public,” there are also attribution issues. For example, Google Drive maintains a complete edit history for a document, making it possible to determine if a user has made unauthorized edits. However, knowing that “Anonymous Panda” was the one at fault doesn’t help much. Additionally, Google Drive doesn’t track access by anonymous users, so only those trying to modify the data (instead of just stealing it) would be detected.

Secure Document Sharing with GhostVolt

Cloud-based document storage, like Google Drive, has made significant strides toward making it possible to efficiently and effectively share documents within a team. However, these systems also have a ways to go.

However, more effective solutions for secure document sharing are available. GhostVolt Business takes the basic services that Google Drive (and similar services) provide and takes them a lot further.

Encryption of all files by default, whether on a user’s personal machine or in the cloud, with AES-256 ensures the security of business data. Access to these documents can be managed by defining specific user roles and managing permissions to files based off of these roles. This makes it easy to map a user’s access to documents within the organization to their job responsibilities.

However, where GhostVolt really stands out is in the visibility that it provides regarding shared documents. All user activity is logged, and GhostVolt has built-in reporting functionality to summarize raw data into readable reports. This, combined with Ghostvolt’s strong access controls, make it easy to maintain and demonstrate compliance with a wide (and growing) range of data protection regulations, such as the EU’s General Data Protection Regulation (GDPR), the Payment Card Industry Data Security Standard (PCI DSS), and HIPAA, SOX, CCPA, and more in the US.

The Importance of Secure Document Sharing

As more regulations like GDPR and CCPA come into effect, organizations are required to strongly protect the data in their possession and to be able to demonstrate that these security controls are in place. On the other hand, the ability to quickly share data throughout the organization is essential to enabling the organization to operate efficiently.

A secure document sharing solution, with built-in encryption and strong role-based access control, is essential to maintaining regulatory compliance. However, it also needs to be intuitive and efficient to use to meet core business needs. When choosing a document sharing solution, an organization should not need to compromise on security, usability, and performance.

Inside the Encryption Backdoor Debate

Encryption technology is designed to protect sensitive data from unauthorized access.  Communications applications with end-to-end encryption, like Telegram or Signal, are designed to ensure that an eavesdropper who intercepts the traffic en route doesn’t have the capability to read it.

Increasingly, government officials have been calling for backdoors to be built into all encryption algorithms.  The most famous and vocal proponent of it is the US Attorney General, William Barr; however, a meeting of representatives from Five Eyes countries (Australia, Canada, New Zealand, UK, and US) recently expressed support for the proposal

Despite the security concerns associated with it, encryption backdoors are required in some countries.  In December 2018, Australia ignored the advice of security and encryption experts and passed a law mandating backdoors in all encryption.

The Call for Backdoors

The argument for encryption backdoors essentially boils down to one of law enforcement.  Encryption is designed to keep all eavesdroppers out of a private conversation, including the police.  Encryption backdoor supporters, like Barr, believe that this ability to “go dark” has been causing an increase in unsolved criminal investigations.

As a result, enterprises may be required to install backdoors in encryption algorithm, regardless of the damage to personal privacy.  Barr has expressed the opinion that the general public doesn’t really need strong encryption since they’re only protecting personal emails and selfies and not the nuclear launch codes.

The problem with Barr’s justification for breaking encryption is that crime rates aren’t rising.  In fact, they’ve been steadily dropping since 1993.  While encryption denies law enforcement access to some forms of evidence, they still have all of the investigative techniques that existed before the rise of computers, and these techniques have consistently proven to be effective.

Backdoored Cryptography: Data Encryption Standard

This isn’t the first time that the US government has tried to weaken encryption for the purposes of law enforcement and/or national security.  In the past, the National Security Agency (NSA) was even successful in weakening one encryption algorithm: the Data Encryption Standard.

In 1973 and 1974, the National Board of Standards, the forerunner of the current National Institute of Standards and Technology (NIST), issued calls for an encryption algorithm that would become the official Data Encryption Standard of the US government.  One submission, developed by IBM, was eventually selected as the winner of the contest.

As part of the review process, the NSA had the ability to review and make suggestions regarding the encryption algorithm.  In the end, they made two major modifications that changed the substitution boxes (S-Boxes) used by the cipher and the key length.

The NSA’s change to DES’s S-Boxes was actually intended to improve the security of the cipher.  At the time, only the NSA knew about a particular method of cryptanalysis, called linear cryptanalysis.  The original S-Boxes of the proposed algorithm were vulnerable to this attack, and the NSA’s modifications corrected this.

The other main change that the NSA made to the cipher was a change to the length of the secret key from 64 to 56 bits (though the NSA wanted 48-bit keys).  This made the encryption keys 256 times weaker than the original cipher and led to its being broken in 1999. The original DES submission would have remained secure for a longer time.

The Challenge of “Secure” Backdoors

The main challenge with backdoors in encryption algorithms is that no-one knows how to do them in a secure way.  In his calls for encryption backdoors, Barr pointed to three previous proposals for accomplishing it:

  1. Allowing law enforcement to silently enter as a participant in a conversation

  2. A hardware chip in phones that stores encryption keys that only law enforcement can access

  3. A multi-layer encryption algorithm that allows law enforcement access to the underlying data

These proposals are nothing new.  The first option has been tried in the past with unfortunate results since it is difficult to ensure that only law enforcement has access to a backdoor.  The problem with the other two options is that they seem like a good idea, but no-one knows how to implement them in a secure fashion.

The Bottom Line

The encryption backdoor debate is troubling since it involves lawmakers deliberately ignoring statements from security and cryptography experts that their requests are impossible to implement in a secure fashion.  If the backdoor proponents succeed, individual privacy could take a major hit.

And the crazy thing is that it won’t even work.  The primary difference between apps like Whatsapp and Telegram and messaging apps like Signal is that Whatsapp and Telegram are operated and maintained by companies and Signal is open-source.

This means that, if companies like Facebook and Telegram comply with the law, all of their users will have potential privacy violations.  However, the criminals that the law is designed to address (and who couldn’t care less about the law) will switch to Signal or other open-source apps where the backdoor can be removed from the code before using it.

The encryption backdoor law boils down to limiting people (and criminals) to only using certain apps on their phones and computers.  The number of jailbroken iPhones, rooted Androids, and unlicensed copies of Windows in use demonstrate that this is another problem that device manufacturers don’t know how to solve.

The Story of Cryptography Part 3: Modern Cryptography

This is the final post in a three-part series on the history of cryptography.  In the first post, we explored encryption algorithms that pioneered some of the core components of encryption: Caesar’s Box and the Vigenere Cipher.  From there, the second post describes some of the historical cryptographic milestones of the 20th century: the Enigma machine, the Data Encryption Standard (DES), and the invention of asymmetric encryption.

In this post, we’ll explore how cryptography has evolved in the 21st century.  Beginning with DES’s replacement, AES, this post explores the use of Elliptic Curve Cryptography (ECC) and some of the areas of cryptographic research currently being explored today: homomorphic encryption and post-quantum cryptography.

The Advanced Encryption Standard

The Data Encryption Standard (DES) was the first encryption algorithm publicly endorsed by the US government.  As a result, it was rapidly adopted in the US and beyond and also subjected to a great deal of scrutiny by cryptographers.  When the DES was defeated in 1997 by a brute-force key guessing attack, it was time for an update.

In 1997, NIST put out a call for proposals for the Advanced Encryption Standard (AES).  This resulted in the submission of fifteen candidate ciphers from around the world and a three-year selection process where cryptographers debated and attempted to break the various ciphers.

In the end, three ciphers from the Rijndael family of ciphers (developed by Belgian cryptographers) were selected as the AES.

The three variants have a 128-bit, 192-bit, or 256-bit secret key.  All three variants use a 128-bit block size (organized into 4 4-byte rows) and multiple rounds.  Each round but the last includes four steps:

  • SubBytes: Application of S-Boxes to transform the state

  • ShiftRows: Shifting of the last three rows by 1, 2, or 3 bytes

  • MixColumns: Combination of the four bytes in each column of the state

  • AddRoundKey: Exclusive-or (XOR) of the state with the round key

Before the rounds begin, an AddRoundKey operation is performed, and the last round does not include MixColumns.  One difference between the different variants is the number of rounds: 10, 12, or 14. The other significant difference is the key schedule used to derive the 128-bit round keys from the 128-bit, 192-bit, or 256-bit secret key.

Security of the AES

AES is currently considered a secure cipher as the only feasible attacks against the full version of AES uses side-channel analysis, where power consumption, electro-magnetic emissions, etc. are measured and used to infer internal values of the state of the cipher.  Other attacks against AES either operate on a reduced number of rounds or have a limited effect on the security.

The best known attack reduces the effective key lengths of the three variants to 126, 189.9, and 254.3 bits.  All of these key lengths are infeasible to brute-force on modern computing hardware.

Elliptic Curve Cryptography

In the previous post in this series, we discussed the invention of asymmetric encryption and how algorithms using it are still considered secure.  However, the security of these algorithms is based upon the key length used. As computing improves, brute force attacks on longer keys become feasible, forcing a move to even larger secret keys.

Elliptic curve cryptography (ECC) provides a limited solution to this problem.  Elliptic curves are mathematical functions with a specific structure with operations that map to operations over the integers.  Addition of two points on an elliptic curve is equivalent to multiplication of two integers, and exponentiation in the integers is equivalent to multiplication of points on an elliptic curve.  As a result, it is possible to construct the same “hard” mathematical problems used in asymmetric cryptography using elliptic curves.

The benefit of using elliptic curve cryptography instead of traditional integer-based algorithms like RSA and Diffie-Hellman is an increased level of security with shorter key lengths.  According to NIST, a 256-bit ECC private key provides equivalent security to a 3072-bit RSA key. ECC-based asymmetric algorithms also consume less energy than their integer-based counterparts, making them more efficient and usable as well.

However, the security advantages of ECC only apply to brute force attacks exploiting key lengths.  The ECC algorithms use the same underlying mathematical “hard” problems as RSA and Diffie-Hellman, meaning that an attacker who finds an “easy” solution to these “hard” problems can attack ECC-based encryption as well.

Next-Gen Cryptography

The Advanced Encryption Standard (AES) and Elliptic Curve Cryptography are in common use today, but they’re largely “solved” problems.  Some of the cryptographic algorithms currently in development today, like homomorphic encryption and post-quantum cryptography, are designed to expand the applications of cryptography or solve problems created by improvements in computing systems.

Homomorphic Encryption

Modern cryptography works very well at protecting sensitive data against attack with modern systems. However, it has one very significant shortcoming: you can’t process encrypted data.  Adding or multiplying the ciphertexts of two AES-encrypted values doesn’t produce the sum or product of their corresponding plaintexts. This means that data must be decrypted in order to be processed, and encryption only protects data at rest and data in transit.

Homomorphic encryption is designed to change this.  A fully homomorphic encryption algorithm allows the user to perform arbitrary computations on a ciphertext and then decrypt the result to the correct plaintext (with all computations applied).

The first fully homomorphic encryption algorithm was proposed by Craig Gentry in his 2009 PhD thesis.  Since then, the first has been actively expanded by Gentry and others to help improve the efficiency, security, and usability of FHE algorithms.

Post-Quantum Cryptography

Another area that is getting a lot of attention in the cryptography space is that of post-quantum cryptography.  In previous posts, we discussed how asymmetric encryption is based upon mathematical “hard” problems. With these problems, legitimate operations (multiplication and exponentiation) are polynomially difficult, but their inverses (factoring and logarithms) have a difficulty that grows exponentially with the length of the numbers used.

This asymmetry allows asymmetric cryptosystems to be developed that are usable but can achieve an arbitrary level of security against attack.  However, this only works if these problems remain “hard”, and an algorithm developed by Peter Shor in 1994 makes it possible for quantum computers to break this assumption.  When quantum computers grow powerful enough, traditional asymmetric encryption will be broken.

In anticipation of this day, cryptographer are actively developing post-quantum asymmetric cryptography algorithms.  These algorithms are based on mathematical problems that are “hard” for quantum computers as well. Several of these problems and algorithms exist, but there is still active research in the field to develop new ones and test existing ones to ensure that secure alternatives are available when traditional asymmetric cryptography breaks.

Wrapping Up

The goal of this three part series is to provide an introduction to the history of cryptography.  Beginning with historical ciphers, moving to 20th century encryption systems, and concluding with modern cryptographic algorithms, this series highlights the inventions with the greatest impact on the cryptographic algorithms that we use today.

The Story of Cryptography Part 2: 20th Century Cryptography

In the first post of this series, we described the early history of cryptography.  This included the development and use of Caesar’s Box and the Vigenere cipher, encryption algorithms that pioneered the use of encryption and the encryption key.

It wasn’t until the 20th century that cryptography came into widespread use.  Notable encryption milestones from the 20th century include the development and use of the Enigma machine, the creation of the US Data Encryption Standard (DES), and the development of asymmetric encryption.

The Enigma Machine

The Enigma machine is the most famous mechanical encryption machine in existence.  Developed by Arthur Scherbius in 1918, it was a machine that implemented a substitution cipher.  The use of several different rotors and reconfigurable wiring allowed the machine to operate in many different configurations and acted as the secret key.

When used, the Enigma operator would type a letter on the keyboard and the next letter of the ciphertext would light up.  Decryption was accomplished via the same process, allowing the use of Enigma for encrypting radio communications (which was necessary for the rapid troop movement that was a vital part of their strategy).  The Enigma machines were famously used by the Germans in WWII, but versions were also employed by the Japanese and Italians as well.

Cracking the German Code

Breaking the encryption of the Enigma machine was a multistage process.  The model in use by the Germans had cryptographic vulnerabilities, allowing a Polish cryptanalysis, Marian Rejewski, to crack the encryption keys in use.  However, he did not have knowledge of the internal wiring of the machines, so he was not able to use this knowledge to break the codes.

This information was obtained by French spy Hans-Thilo Schmidt, along with the codes used for encryption in September and October 1932.  Using captured ciphertexts from this period, the Polish were able to build a working Enigma machine and decrypt German communications starting the following January.  This began a cat and mouse game between the Germans and the Poles until 1938, when German improvements made decryption too resource-intensive for the Poles to maintain.

In July of the following year, the Poles disclosed their Enigma decryption efforts to French and British military intelligence.  This partnership allowed the British to develop their Enigma decryption efforts at Bletchley Park and continue decryption of German communications for the rest of the war.  In the end, one of the deciding factors in their ability to decrypt Enigma communications was procedural errors like failure to update encryption keys regularly 

Lucifer: The Data Encryption Standard

Before the early 1970s, the use of cryptography was primarily restricted to the military.  However, in 1971, the first encryption algorithms for private use were published. Companies had seen the effectiveness of encryption and were interested in applying to the protection of their intellectual property.

The first set of civilian encryption algorithms were developed by IBM under the name Lucifer.  Several different variants were developed internal to IBM, and one of the variants was submitted to the US National Bureau of standards (the precursor of NIST) in response to a call for proposals for the national Data Encryption Standard.

The submitted version of Lucifer was vulnerable to differential cryptanalysis, an attack only known to the NSA at the time.  However, after some modifications by the NSA, IBM’s Lucifer submission was accepted as the Data Encryption Standard (DES).

The accepted version of DES was a Feistel network, a multi-round encryption algorithm whose structure is shown in the images above, where the right image shows the operations performed in the boxes labeled F (for Feistel function) in the left image.  As shown, the message undergoes a permutation at the beginning and the end of the encryption process (IP and FP) and in each round of the Feistel function (labeled P). The E box in the right image represents an expansion of the half block from 32 to 48 bits, and each of the S-Boxes represents a substitution table that reduces 6 input bits to 4 output bits.  The cipher also has a key schedule, which generates 16 48-bit round keys from the original 56-bit secret key.

Brute-Forcing the Data Encryption Standard

The NSA made several different modifications to the Lucifer cipher before it was accepted as the DES.  Some modifications, like changing the S-Boxes to protect against differential cryptanalysis, were designed to improve the security of the cipher.

Others, like changing the key length from 128 bits to 56 bits, may have been designed to ensure that the NSA still had the ability to break the cipher at need.  In the end, DES wasn’t broken based upon cryptographic vulnerabilities (though some have been identified), but by a brute force attack taking advantage of the limited keyspace.  DES was first broken in 1997, paving the way for the Advanced Encryption Standard.

DH and RSA: Asymmetric Encryption

The original encryption algorithms are all symmetric encryption algorithms.  These systems require both the sender and recipient of a message to both have access to the same encryption key.  This requires that the users have the ability to set up a secure channel to transmit this key before encryption can begin.

However, in the 1970s, the invention of asymmetric cryptography changed this.  A series of encryption algorithms developed by the UK’s GCHQ made use of mathematical one-way functions to implement asymmetric encryption.  These algorithms allowed two parties to create a shared private key over a public channel or let a user generate a private/public keypair where the public key can be used for encryption of messages that can only be read using the private key.

The one way functions used in most asymmetric cryptography are the factorization and discrete logarithm problem.  The security of these schemes relies on the fact that operations like multiplication and exponentiation are “easy” (with polynomial difficulty) and the inverse operations, factoring and logarithms, are “hard” (with exponential difficulty).  Asymmetric encryption algorithms are designed to allow legitimate users to perform “easy” operations while forcing attackers to solve “hard” problems. The asymmetric relationship between the attacker and the defender means that it’s possible to create algorithms that are usable but have an arbitrary level of security against brute-force attacks.

While GCHQ cryptographers originally developed asymmetric cryptography in the early 1970s, their results and algorithms remained classified until 1997.  As a result, the algorithms were independently discovered by private researchers and are named after these parties.

The original key exchange protocol was invented by Malcolm J. Williamson of GCHQ in 1974 and is named after Whitfield Diffie and Martin Hellman, who discovered it in 1976.  The RSA algorithm for asymmetric encryption and decryption was discovered by Clifford Cocks in 1973 and a more general version was invented by Ron Rivest, Adi Shamir, and Leonard Adleman in 1976.  Since then, several different asymmetric encryption algorithms have been invented.

Coming Up: Modern Cryptography

The encryption algorithms presented in this article are largely broken, with the exception of the asymmetric algorithms Diffie-Hellman and RSA.  However, even the security of these algorithms are threatened by advances in computers. The final post in this series describes some of the modern encryption algorithms currently in use or development that are designed to replace these algorithms as they are broken.

The Story of Cryptography Part 1: Historical Cryptography

Cryptography is the science of secrets.  Literally meaning “secret writing”, cryptography is designed to hide information from all but its intended recipients.  Modern cryptography is essential to the secure Internet, corporate cybersecurity, and blockchain technology.

However, cryptography has a very long history before the modern ciphers were invented.  In this three-part series, we’ll explore the history of cryptography before the 20th century, in the 20th century, and in the modern day.

Caesar’s Box

Caesar’s Box is one of the earliest known ciphers.  Developed around 100 BC, it was used by Julius Caesar to send secret messages to his generals in the field.  Since messengers could easily be waylaid by the enemy en route, the use of even a simple cryptographic algorithm to encode his orders and his generals’ responses could give him a significant strategic advantage: he could intercept and read his opponents’ messages but they can’t read his.

Caesar’s box is a particular implementation of a shift cipher (which is a specific type of substitution cipher).  In Caesar’s Box, the encryption algorithm involved shifting each letter in the message three letters to the right to produce the ciphertext.  For example, A became D, B became E, and X became A. Decryption involved reversing this process by shifting each letter of the ciphertext three steps to the left.

In Caesar’s Box, the secret step amount was three, but this isn’t the only option.  Other shift ciphers (like ROT13) use different numbers of steps (like 13). However, all shift ciphers are insecure and can be easily broken.

Frequency Analysis: Breaking Caesar’s Box

It turns out that the hardest part of breaking Caesar’s Box is knowing the language of the message that it encodes.  Once you know that, it’s easy to break the cipher using frequency analysis. If that fails, a brute force attack against the cipher can work, since there are only 26 possible shifts in English.

Frequency analysis attacks take advantage of the fact that all of the letters in the English alphabet are not used equally.  E is the most commonly used letter, and you hardly ever see a word using z. In fact, this is the first time it appears in this article.  The relative frequencies of each letter in the English language are shown in the graph below.

The frequencies of the letters usage in English is important because Caesar’s Box does nothing to change them.  In a ciphertext encrypted with Caesar’s box, the most common letter is likely to be h.

Knowing this, it’s easy to determine the shift factor for Caesar’s Box or any other shift cipher out there.  With this information, the rest of the ciphertext can be easily decrypted. In the event that the shift factor is incorrect (a sentence may have more t’s than e’s for example), there are few enough options for a shift cipher that it’s easy to try them all.

Vigenere’s Cipher

Caesar’s Box may be the first famous cipher in existence, but it’s missing something rather important for modern ciphers: a secret key.  While the number of steps to shift may be considered the secret key for a general shift cipher, this value is hardcoded as three for Caesar’s Box.  As a result, the security of Caesar’s Box relies on the fact that no-one knows how it works, a practice called “security through obscurity” that violates Kerckhoff’s Law.

Vigenere’s cipher was created in the 16th century and introduced the concept of a secret key.  In Vigenere’s cipher, the secret key is another word or phrase that may be shorter than the plaintext to be encrypted.  If this is the case, the key is repeated until it matches the length of the plaintext.

To encrypt using Vigenere’s cipher, you convert the letters in the plaintext and the key to numbers in the range 0-25, add each pair of numbers together, and calculate the result modulus 26 (the result of dividing by 26 and keeping the remainder).  The output of this calculation is mapped back to a letter and used as a character of the ciphertext.

The image above shows a lookup table for performing encryption with the Vigenere cipher.  The columns are the letters of the secret key and the rows are the letters of the corresponding letter of plaintext.  Their intersection is the letter of ciphertext created from a given pair of plaintext and key letters.

Cryptanalysis of Vigenere’s Cipher

Like Caesar’s box, decryption of the Vigenere cipher can be performed using frequency analysis.  However, it takes a little more work. In order to decrypt Vigenere’s cipher, it’s necessary to first determine the period of the cipher and then apply frequency analysis.

The period of the Vigenere cipher is the length of the secret key used for encryption.  For example, encryption of a 36 letter plaintext with the key CIPHER would actually use a key of CIPHERCIPHERCIPHERCIPHERCIPHERCIPHER.

Once you know the period of the cipher, it’s possible to decrypt it just like a Caesar cipher.  Note from the image above that each particular letter of the key just creates a shift of a certain amount.  If you know the letters of the ciphertext that used the same shift value, you can apply frequency analysis to the cipher.

This is why the period of the cipher is important.  Note that the first, seventh, etc. letters of the sample key are all C (a shift of 2).  Through either guess and check or using calculation of the Index of Coincident, it’s possible to determine the actual period, shift values, and plaintext of the cipher.

Coming Up: 20th Century Ciphers

Caesar’s Box and the Vigenere cipher are two of the earliest known ciphers.  They pioneered the use of encryption to protect sensitive communications data and the use of a secret key in encryption.  In the next post in this series, we’ll move forward to the 20th century. There, we’ll see how cryptography evolved when driven both by military interests and organizations protecting their intellectual property.

Introduction to Blockchain: Fundamentals

This article is the first in a multi-part series describing how blockchain works and some of the security assumptions associated with it. Each article will describe a different level of how the blockchain’s distributed ledger operates, starting with the fundamentals.

At the most basic level, blockchain technology is composed of cryptographic algorithms. The creator of blockchain, Satoshi Nakamoto, developed a system in which the trust that we traditionally place in organizations to maintain trusted records (like banks) is transferred to the blockchain and the cryptographic algorithms that it uses.

The Cryptography Behind Blockchain

The goal of the blockchain is to create a distributed, decentralized, and trusted record of the history of the system. The most famous blockchain, Bitcoin, uses this record to store the history of transactions, so people can make and receive payments on the Bitcoin blockchain and trust that their money won’t be lost or stolen.

In order to achieve this level of trust, the blockchain uses a couple of cryptographic algorithms as building blocks. Hash functions and public key cryptography are crucial to both the functionality and security of the blockchain ecosystem.

Hash Functions

A hash function is a mathematical function that can take any number as an input and produces an output in a fixed range of numbers. For example, 256-bit hash functions (which are commonly used in blockchain), produce outputs in the range 0-2256.

In order to be considered secure, a hash function needs to be collision-resistant, this means that it’s extremely difficult (to the point of being nearly impossible) to find two inputs that create the same hash output. Accomplishing this requires a few different features:

  • No weaknesses in the hash function

  • A large number of possible outputs

  • A one-way hash function (can’t derive the input from the output)

  • Similar inputs produce very different outputs

If a hash function meets these requirements, it can be used in blockchain. However, if any of these requirements are violated, then the security of the blockchain is at risk. Blockchain relies heavily on secure hash functions to ensure that transactions cannot be modified after being stored in the ledger.

Public Key Cryptography

The other cryptographic algorithm used in blockchain technology is public key cryptography. This type of cryptography is also widely used on the Internet as well since it has so many useful properties.  With public key cryptography, you can:

  • Encrypt a message so that only the intended recipient can read it

  • Generate a digital signature proving that you sent a given message

  • Use a digital signature to verify that a message was not modified in transit

In public key cryptography, everyone has two different encryption keys: a private one and a public one. Your private key is a random number that you generate and keep secret. It is used for decrypting messages and generating digital signatures.

Your public key is derived from your private key and, as the name suggests, is designed to be publicly distributed. It’s used for encrypting messages to you and generating digital signatures. Your address (where people sent transactions to) on the blockchain is typically derived from your public key.

The security of public key cryptography is based on two things. The first is the secrecy of your private key. If someone can guess or steal your private key, they have complete control of your account on the blockchain. This allows them to perform transactions on your behalf and decrypt data meant for you. The most common way that blockchain is “hacked” is people failing to protect their private key.

The other main assumption of public key cryptography is that the algorithms used are secure. Public key cryptography is based off of mathematical “hard” problems, where performing an operation is much easier than reversing it. For example, it’s relatively easy to multiply two numbers together but hard to factor the result. Similarly, it’s easy to perform exponentiation but hard to calculate logarithms. As a result, it’s possible to create schemes where computers are capable of performing the easy operation but not the hard one.

The security of these “hard” problems are why you’ll often see articles about quantum computers breaking blockchain. Due to how quantum computers work, factoring and logarithms aren’t much harder than multiplication and exponentiation, so traditional public key cryptography no longer works. However, other problems exist that are still “hard” for quantum computers, so the threat of quantum computers to blockchain can be fixed with a simple upgrade. 

How Blockchains are Put Together

As its name suggests, the blockchain is a collection of blocks that are chained together to create a continuous whole. In this section, we explore how this works.

Blocks

The purpose of the blockchain is to act as a distributed ledger that stores data in a secure fashion. The blocks are the place where this data is stored.

blocks.png

The image above illustrates the basic structure of a block in a blockchain. We’ll talk about every part of this image throughout the series, but for now focus on the green sections. Each green piece represents a transaction within the block. While a transaction may represent a literal transaction (i.e. a transfer of value) on blockchains like Bitcoin, this is not the only option. As we’ll see later, smart contract platforms store other things (like computer code) as transactions as well.

The security of the blocks in the digital ledger depends on the security of public key cryptography. Every transaction and block in the blockchain is digitally signed by its creator. This allows anyone with access to the blockchain to easy validate that every transaction is authenticated (i.e. sent by someone who owns the associated account) and has not been modified since creation. The integrity and authenticity of the blocks in the chain is also assured by the digital signature of the block creator.

Chaining

Each block is equivalent to a single page in a bank’s account ledger; it only represents a slice of the history of the network’s history. In order to combine these slides into a continuous whole, the blockchain makes use of hash functions.

In the image above, you can see the hash functions linking each block together. Each block contains the hash of the previous block as part of its block header (the section not containing transaction data).

The fact that every block is dependent on the previous one is significant because of the collision-resistance of hash functions. If someone wanted to forge block 51 in the image, they have two options: find another version of block 51 that has the same hash or forge every block after 52 as well. The first is supposed to be impossible (due to collision resistance) and the other should be difficult or impossible since the blockchain is designed to make forging even a single block difficult (more on this later).

The security of the “chain” part of blockchain is based upon the collision resistance of the hash function that it uses. If someone can find a way to generate another version of block 51 that has the same hash, the immutability assumptions of blockchain break down and you can’t trust that any transaction will remain in the distributed ledger.

What's Next...

At this point, we’ve covered the fundamental cryptographic algorithms and data structures used in blockchain. The next two articles in this series are focused on the infrastructure behind the blockchain: the nodes and how they’re networked together.