From: ritter@io.com (Terry Ritter)
Newsgroups: sci.crypt
Subject: The Cloak2 Design
Date: 6 Oct 1994 00:38:47 -0500

 The Cloak2 Cipher Design -- A Dynamic Substitution Stream Cipher

      Terry Ritter


 The original Cloak design, circa 1990, was intended to demonstrate
 the practical use of Dynamic Substitution, and to highlight the
 difference between a serious cipher and the many "toy" ciphers
 available at the time.

 Cloak2 is intended to be an improved design with added speed and
 especially powerful key-management, which is the heart of a modern
 cipher.  But, fundamentally, Cloak2 remains a secret-key stream
 cipher with large internal and message keys.


 STREAM CIPHER

 The Cloak2 design is based on stream cipher technology, which seems
 substantially less controversial than that for block ciphers.  This
 also allows Cloak2 to avoid the minor problems involved when
 variable-length data are collected in fixed-length blocks.


 BINARY ORIENTATION

 The Cloak2 design is a binary-oriented cipher in the sense that
 any plaintext file is always translated into binary ciphertext;
 there is no "network ASCII" mode.

 The Cloak2 re-origination unit is the "block," a simple data
 structure consisting of a 32-bit length value, and the ciphertext
 data.  (The length value in a one-block file is the same as the
 overall file length.)  A single file can contain multiple blocks,
 which can be separately deciphered into a common result file.
 This peculiar capability allows keys to be added to an alias file
 without deciphering the alias file, and also allows ciphertext
 under the same key to simply be collected in archives.


 THE DESIGN

 Like most stream ciphers, the Cloak2 design consists of a random
 number generator (RNG) subsystem, and a reversible data-plus-
 confusion combining subsystem.  One intent of the original design
 was to show that a cipher with a huge amount of internal state is
 indeed a practical design.  As computers have become faster, the
 design has become even more practical.


 The Main RNG

 The Cloak2 RNG subsystem produces 32-bit values using an Additive
 RNG of degree 9689 with an internal state of 310,048 bits (9689 x
 32) and a primitive trinomial.  This output then goes into a
 standard Jitterizer [3:111] which deletes about 10% of the sequence,
 and provides a changing exclusive-OR offset value for each "take-
 group" in the sequence.

 The nearly 38K of state needed to run the main RNG is developed
 from a 124-byte original value by using smaller Additive RNG's to
 fill the state for larger RNG's.  The original value is sufficient
 to fill a 32-bit-wide deg-31 Additive RNG, which fills a deg-127
 RNG, which fills a deg-607 RNG, which fills a deg-3217 RNG, which
 fills a deg-9689 RNG.  In every case, the RNG fills separate store,
 and RNG output is isolated and confused by a "jitterizer" mechanism
 which deletes part of the linear sequence and changes the rest.
 Thus, the original 992-bit value is nonlinearly altered four times
 as it is expanded.  A similar approach (with different polynomials
 and ultimate size) is the basis for all the other RNG's in the
 Cloak2 design.


 The Jitterizer

 The Jitterizer is basically a simple mechanism which is intended
 to delete values from a linear sequence, and to confuse the values
 which are accepted.  First, an RNG value is collected, in which
 two bits represent the number of RNG values to "skip" (1..4), and
 six bits represent the number of RNG values to "take" in the next
 take group (1..64).  Then the indicated number of RNG values are
 skipped, with the last value retained as an "offset," to be
 exclusive-ORed with all members of the upcoming take group.  Then
 another RNG value is produced, exclusive-ORed with the offset, and
 returned.  Subsequent Jitterizer calls collect and offset each RNG
 value until the number of take counts expire and it is time to skip
 more RNG values and produce a new offset.


 Data-Plus-Confusion Combining

 The Cloak2 combiner subsystem takes single-byte plaintext input,
 plus two RNG confusion bytes, plus four more confusion bits, and
 produces a one-byte ciphertext result.

 There are two sequential combinings:  The first combiner is a 256-
 byte Dynamic Substitution combiner [1,2]; the second combining
 level is an arbitrary selection from an array of 16 such combiners.
 All seventeen tables are properly shuffled (without bias) by values
 from the main RNG before data ciphering starts.  (Note that a two-
 level combiner only makes cryptographic sense when at least one of
 the combining levels is nonlinear.  And an array of combiners only
 makes cryptographic sense when the combiners differ; choosing from
 an array of exclusive-OR's would not be particularly helpful.)


 Dynamic Substitution

 Dynamic Substitution is a reversible nonlinear data combiner based
 on a substitution table (or polyalphabetic tables).  In Dynamic
 Substitution, the arrangement in the table changes, potentially
 after every substitution operation.  One way to do this is to
 notice that each substitution operation "selects" one of the
 elements in the table.  If we now use a confusion sequence to also
 select an element, and then exchange the selected elements, we
 get a combiner.  In the table, each substitution element changes
 "at random" whenever it is used, and the more often it is used,
 the more often it changes.  The effect of this is to adaptively
 even out input symbol-frequency statistics.  Also, the elements
 are changed "behind the scenes" (the exchange is not visible until
 future data selects one of the exchanged elements), and this tends
 to hide the confusion sequence.

 Dynamic Substitution is deciphered by maintaining an inverse table,
 and by updating the inverse table dynamically whenever elements in
 the forward table are exchanged.

 Dynamic Substitution is indisputably stronger than the exclusive-OR
 combiner which is used in most stream ciphers:  With exclusive-OR,
 each byte of known-plaintext and associated ciphertext will reveal
 a byte of confusion sequence.  In a similar situation, Dynamic
 Substitution does not expose confusion sequence bytes.

 However, Dynamic Substitution is a tool, not a panacea.  Confusion
 values can be deduced by fully exploring the table before and after
 any particular substitution, knowing that any difference is entirely
 due to element selection by data and confusion.  (Note that two full
 table explorations are required before a single confusion byte is
 exposed by Dynamic Substitution, as compared to the single known-
 plaintext pair needed under exclusive-OR.)  To the extent that the
 cryptographer allows such exploration, the confusion sequence can
 be revealed and the confusion generator attacked.  In the Cloak2
 design, the second combiner and message keys make table exploration
 difficult or impossible.


 OPERATION

 Like most message-key based ciphers, Cloak2 ciphering occurs in two
 steps: first a random message-key value is produced and enciphered
 (or just deciphered); then the "plaintext" message-key is used to
 initialize the data cipher.  Each and every Cloak2 data ciphering
 is totally and almost solely dependent on a unique random 992-bit
 message key.


 Ciphering Message Keys

 Cloak2 124-byte message-keys are ciphered by exclusive-OR using a
 124-byte value which is constant for any particular User Key.  This
 "message key cipher value" is produced by a transient degree-607
 "message key cipher" RNG initialized from the User Key.

 The User Key may be a language phrase, a string of random ASCII
 characters, or an arbitrary-length sequence of bytes.  Often,
 the User Key is obtained by searching an enciphered alias file,
 which is incrementally deciphered by yet another incarnation of
 this same cipher.  The use of an alias file means that long random
 User Keys are a practical possibility.

 The User Key is converted into a 992-bit result using 32 degree-31
 CRC's, and the result initializes a degree-31 Additive RNG.  This
 RNG fills a degree-127 RNG which fills a degree-607 RNG which is
 the message key cipher RNG.  (In all cases and at all times, each
 RNG sequence in Cloak2 is jitterized and nonlinear.)

 The 4-byte-wide jitterized message key cipher RNG output is
 exclusive-ORed into a one-byte result.  A total of 124 such bytes
 is the message key cipher value, which is based entirely on the
 User Key.  The RNG is then reused.


 Producing Message Keys

 Cloak2 124-byte message-key values are produced by a degree-607
 RNG which is initialized from an unknowable value.  The use of a
 cryptographic RNG for message keys seems necessary to support
 massive wildcard ciphering.  (A hardware "really random" source
 of sufficient bandwidth could be used in the exact same cipher,
 but this does not exist within the normal PC environment.)

 The "unknowable value" is the exclusive-OR of the values in the
 main RNG (key-related, thus assumed unknown), the intermediate
 expansion store (also key-related), and a large 992-bit hash
 result.  The large hash combines unknown existing memory values,
 input characters during data-entry, and PC-platform precision time
 readings.  The time spent in the hashing loop is data dependent
 and varies more than 3:1, so the number of samples of a particular
 key-press varies widely.  Human typing delays also affect the hash.
 Precision time values are hashed along with the current character;
 since the hashing loop times vary widely, the time differences
 between hashings also vary widely.

 A single bit-change avalanches the large hash in five passes, and
 in a 386/25 we get about six large-hash passes per second during
 data entry.  Data entry includes both the editable command-line
 as well as User Key, Alias Key, or Batch Key entry and editing.
 The key-related RNG values may well have been expanded from a
 different and very large random key in an alias file.


 Using Message Keys

 The 124-byte "plaintext" message key is expanded into the degree-
 9689 main data RNG using a series of intermediate RNG's as
 described previously.  The resulting jitterized RNG is used to
 shuffle the 17 Dynamic Substitution tables, and then encipher data.
 (A CRC-32 of the data is also enciphered so the correctness of the
 recovered data can be ascertained.)

 Cloak2 needs 17 keyed 256-byte tables and possibly their inverses.
 These are set up in arbitrary sequence by first shuffling an array
 of values 0 through 16, then shuffling the tables in the resulting
 order.  A proper unbiased shuffle uses one-byte combined values
 from the 32-bit-wide jitterized degree-9689 main RNG.

 At any one time, there may be three complete levels of this cipher
 which are separately active:  A Cloak2 batch-file, the alias-file,
 and the data-file (although only the data-file level can possibly
 encipher).  The cipher at each level is the same, and the batch
 files and alias files are simply enciphered as data.  The program
 opens these files as needed, and deciphers them incrementally (and
 potentially repeatedly) in a normally-invisible process.


 COMMENTS

 Perhaps the most controversial part of the Cloak2 design is the
 huge 992-bit size of the keys (the internal key and message key).
 Normally, an 80-bit key would be considered secure against a key-
 exhaustion attack.  A birthday attack on 80-bit message keys could
 find some identical key values with about 2**40 messages; if this
 were important, perhaps we could argue that message keys should be
 160 bits.  So why are Cloak2 keys six times that length?

 The main reason for huge keys is that they support a convenient,
 straightforward design at an acceptable cost.  An extra 132 bytes
 per file seems negligible in the context of modern files, so--in
 effect--there is no reason NOT to use huge keys.  The main expense
 in Cloak2 is the slight delay for RNG expansion to degree-9689,
 and this is not related to base key size.  The expense involved in
 running an Additive RNG is independent of RNG size.  There is of
 course some advantage to having a similar base key size for both
 the User Key hash and the message key itself, since this same base
 size can be expanded in the same way.  There is also advantage in
 using random message key values to directly seed the RNG, since
 this means that we avoid intermediate processing and so need not
 discuss the possible weaknesses of such processing.  Huge keys also
 help to avoid the usual simplistic arguments based solely on key
 size, and so may contribute to a higher level of discussion of the
 design.

 Even with a base state of 992 bits, we can fully seed only a
 relatively small RNG, and an Additive RNG of degree N can be
 completely penetrated with a knowledge of only N elements.  But
 the attack must somehow produce N elements of information from the
 ciphertext, and in Cloak2 we only get one byte of ciphertext for
 each four-byte element from the RNG.  So messages of under 38,756
 bytes simply do not contain sufficient information to develop the
 state of a degree-9689 RNG.  Stream-cipher messages with less state
 than that in the RNG are arguably secure.

 The advantage of an RNG with huge state for messages containing
 more than that state is more ambiguous.  Still, it is not completely
 unreasonable to expect that, if the isolation mechanisms do leak
 information, they may do so at some fraction of the data rate.  This
 could greatly expand the size of the message such an RNG would
 arguably protect absolutely.  Messages larger than that would be
 insecure at some cost level--hopefully high, and repeated for every
 message.  Of course, any cipher can only protect information which
 is otherwise secret, and virtually any other approach would be
 preferable to attacking Cloak2 ciphertext.

 The step-by-step nonlinear expansion to degree 9689 is almost twice
 as expensive as a simple linear expansion in a single buffer using
 the same polynomials.  One could argue, however, that such linear
 expansion would allow one to attack the original RNG state using
 the information in the degree 9689 RNG.  Consequently, RNG expansion
 must be nonlinear.


 STRENGTH ARGUMENTS

 Brute Force on the RNG:  Try each possible RNG state until the
 cipher is broken.

      There are 992 bits of independent RNG state:  Clearly, it is
      only necessary to step through all possible states to "break"
      the cipher.  This is the design strength.


 Brute Force on the User Key:  Try each possible User Key until the
 cipher is broken.  Try most-likely keys first.

      No cipher can do very much about this if a user really wants
      to use a weak key.  The advanced Cloak2 program supports the
      automatic generation of random keys into an alias file, where
      they can be selected and used with a non-secret alias tag.
      This does not completely solve the problem, however, because
      the alias file itself requires a User Key.

      Cloak2 supports User Keys of virtually arbitrary length, but
      even a long key is not necessarily "strong."  Concatenating
      the names of family members will not make a good key.  A
      strong key must be unique and unsearchable, and never written
      down or stored as plaintext in other files.  Cloak2 alone
      cannot assure this.

      Since the alias file User Key must be entered, it can be
      stolen, and so should be changed periodically.  This is easy
      to do by deciphering the alias file under the current key
      and then immediately enciphering under the new key.  But if
      this is not done occasionally, co-workers may have immediate
      direct access to enciphered files.  In most cases, the alias-
      file key will be the weakest part of the system.


 Chosen Key:  Try various keys, adaptively, and compare the resulting
 ciphertext to the real ciphertext, to try and gain insight into the
 correct key value.

      Given that CRC is used to generate the internal RNG base state
      from the User Key, it is hard to see how any particular bit
      arrangement could possibly be preferred, or occur unusually
      often.  This is one big reason for using CRC to process the
      User Key, as opposed to, for example, a "cryptographic" hash
      which has no known strong mathematical basis.

      Note that the resulting CRC state is only used to produce an
      arbitrary message key cipher value, which ciphers arbitrary
      message keys.  It is hard to imagine how changing the User Key
      could possibly produce directed results in the main data RNG,
      for example.  Undirected results are unlikely to be useful.


 Differential Cryptanalysis: Exploit known properties of particular
 substitution tables to produce probable bit changes after one or
 more levels of substitution (this can effectively reduce the number
 of "levels" in an iterated block cipher).

      Differential Cryptanalysis does not appear to apply to this
      cipher, because all Cloak2 tables are "keyed."  That is, all
      Cloak2 tables are unbiased arbitrary permutations created by
      RNG state which is expanded from an arbitrary message key.
      Since proper shuffling algorithms are used, every table
      permutation is equally probable.  This means that the table
      arrangements will be unknown to The Opponent, so the advantage
      of a knowing a particular table arrangement is lost.

      It is certainly true that some table arrangements can be said
      to be "better" than others in terms of block cipher avalanche.
      But these "weak" tables may not be weak when used in stream-
      cipher processing.  And Dynamic Substitution tables continue
      to change during processing, so any possible advantage is at
      best transient.


 Ciphertext Only:  Accumulate a mass of ciphertext material and try
 to find relationships within the data which expose successive
 levels of the mechanism, until the cipher is broken.

      The data flowing through the Cloak2 cipher are extensively
      randomized by two stages of adaptive Dynamic Substitution and
      a substantial jitterized Additive RNG.  It is hard to imagine
      how a statistical relationship could survive this.  Alas, a
      definitive statement would seem to imply a particular defined
      statistic:  We cannot test for the existence of a statistical
      pattern until we know what to test for.


 Known Plaintext:  "Obtain" some amount of plaintext and the
 corresponding ciphertext.  Use this material to develop the
 original cipher state, thus supporting the deciphering of material
 which has not been obtained.

      Cloak2 has a two-level combiner which combines a single byte
      of data with two bytes and four bits of confusion to produce
      a single byte of ciphertext.  Accordingly, a single known-
      plaintext byte cannot, by itself, describe the confusion which
      produced the ciphertext.  Nor do single bytes reveal the
      Dynamic Substitution table states, because the selected table
      element (and indeed--at the second level--the table itself)
      changes at random after use.  In contrast to other stream-
      cipher designs, known-plaintext will not immediately resolve
      the confusion value in a Dynamic Substitution design so that
      the RNG can be attacked.


 Key-Stream Repetition:  Use known-plaintext to develop the values
 from the RNG, then wait for the RNG to cycle, and use those RNG
 values again.

      Cloak2 has RNG's with cycle lengths far too large to repeat
      in practice.


 Birthday Attack on Message Keys:  Accumulate enough ciphertext to
 find two or more messages enciphered with the same message key,
 then try to analyze the state, which must start the same if the
 message key is the same.

      First, identical Cloak2 message keys can only be identified
      under the same User Key.  That is, if the same message key
      value occurs under different User Keys, it will produce two
      different enciphered message key values.  Thus, we will not
      know, externally, that the message keys are in fact the same.

      Next, if we assume that message key values are "random like,"
      the probability of finding two messages with the same huge
      message key is very, very, very small.

      Last, note that the Dynamic Substitution combiner state will
      deviate with each data difference.  As the combiner states
      diverge, analysis of the RNG state would apparently be
      increasingly complex.


 Chosen Plaintext:  Somehow arrange to use the cipher to encipher
 potentially huge amounts of data specifically chosen to reveal
 the cipher.  (It is not clear that this would be a realistic attack
 in the context of normal secret-key end-user-cipher operation.)

      Since message-key values are not under the control of the
      user, there simply is no fixed internal state to be examined
      by using different messages.  And since both the Dynamic
      Substitution and RNG state is dynamic, there is no fixed
      internal state that can be examined in one message either.

      The obvious attack on a single level of Dynamic Substitution
      is to use defined-plaintext repeatedly to traverse the entire
      table both before and then after a particular character is
      enciphered.  This will reveal the confusion byte at the
      combiner so that the RNG can be attacked in the way of the
      usual exclusive-OR stream cipher.  (Note that this requires
      far more work than the exclusive-OR combiner.)

      Cloak2 complicates such attacks in several ways:  1) the RNG
      is large, and attacks on Additive RNG's are probably cubic in
      RNG length;  2) only 20 out of 32 of the RNG bits are available
      even assuming complete combiner penetration;  3) the RNG is
      isolated from attack by the nonlinear "jitterizer" mechanism;
      4) the combiner is two levels deep, making the traversal
      attack complex; and  5) a defined-plaintext attack is not
      particularly useful in the context of a message key which is
      different for each and every message, since the RNG and
      combiner tables will be set up differently each time.

      There is no known effective attack on dual-level-combiner
      message-key cipher like Cloak2.


 CURRENT IMPLEMENTATION

 The current Cloak2 implementation is a relatively small (47K
 including substantial "help" information), relatively fast
 (154 KBytes/sec max. on a 486/DX2-50), command-line program for
 DOS on IBM PC-type machines.  The current design includes an
 interactive command-line entry mode which is easy to use under
 Microsoft Windows.  Inner ciphering loops are written in hand-
 optimized assembly-language for best speed, but the program will
 run on an old 8088 machine, and requires no numeric co-processor.

 The overall cipher design includes extensive key-management
 features to help create, transport, and use large random keys.
 Enciphered alias files allow many secret keys to be kept under
 a single alias-file key, and then selected by public alias.
 This supports a single key-entry by the user, since, if the
 key is entered incorrectly, the alias will not be found and
 ciphering will not occur.

 Aliases inherently support painless key-update, since the user
 continues to use exactly the same alias for each new key.
 New Cloak2 keys can be added to the enciphered alias file
 without deciphering and possibly exposing the existing keys.

 Dated-aliases allow keys to be programmed in advance, and
 automatically used when the date changes.  The ability to access
 old, replaced keys by date supports access to old messages and
 entire archives.  Alias files also support the centralized key-
 management important for business.  As people leave, alias files
 can be extended with new keys, and ordinary users can use the
 exact same aliases and so be impacted little if at all.

 The Cloak2 cipher is a serious commercial design, and is part of
 a growing family of serious cipher engines intended for use by
 software developers.  These designs use various technologies to
 achieve a wide range of speed versus performance tradeoffs in
 software implementation.


 References

 [1]  Ritter, T.  1990.  Substitution Cipher with Pseudo-Random
      Shuffling:  The Dynamic Substitution Combiner.  Cryptologia.
      14(4): 289-303.

 [2]  Ritter, T.  1990.  Dynamic Substitution Combiner and Extractor.
      U.S. Patent 4,979,832.

 [3]  Ritter, T.  1991.  The Efficient Generation of Cryptographic
      Confusion Sequences.  Cryptologia.  15(2): 81-139.

