Asked by:
Creating own encryption algorithms

Hey there.
I'm studying a module about cryptography at the moment. Without any assistance from the college we have to design our own application to encrypt/decrypt a file based with a keyword input, including our own, uniquely designed but possibly flawed algorithms with no direct relation to any publically accessible open source/patended algorithms..
To get top marks asymmetric encryption is required, but I'm not going to be silly and pretend I'm skilled enough to master that in a few weeks.
When considering standard private key encryption, the problem I find is converting the keyword into a key as such. The word has officially got to be between 10 and 40 characters. Converting ASCII to bits shows 1 character as 8 bits, meaning 10 characters = 80 bits, and 40 characters = 320 bits.
So the big problem I've got is right at the start. How can I get from a random number of bits between 80 and 320 (even only) to a standard 128bit key for example, WITHOUT using repetitive padding (easy to crack etc...)
Though this isn't a coding discussion at first thought, it will hopefully turn into that. And rather than asking for an answer, if anyone could assist in the thought process to getting to that answer, it would be much appreciated! I can worry about taking the key and converting into ciphertext once I have this first bit figured I guess. I'm struggling to even start coding without getting this sorted!
Thanks
 Edited by Suggy Tuesday, November 04, 2008 10:53 PM
General discussion
All replies

I'm not sure what you're asking. Are you asking how to convert a supplied encryption key that is between 10 and 40 characters to an 128bit hash? There's many ways. you could perform some sort of algorithm on the data (like multiplying each byte in the data by the next) calculating a 128 number (Decimal values are 128bits for example).
A key doesn't need to be hard to crack, the algorithm that uses the key does.
http://www.peterRitchie.com/blog 

Thanks for the quick replies guys.
First of all, I was asking how to convert a "password" that the user inputs, into the 128bit key. I guess the password is the key, so converting that into a 128bit hash as Peter says.
Secondly, apologies for my bad order of things. Apart from the theories behind cryptography there has been no real teaching involved so I'm trying to research the whole thing while designing it.
Ok so I should just get cracking with figuring out what to do with the plain text before worrying about getting the key... That I can start to do.

A key is just a string of gibberish. It can even be "mama" or something. It is completely arbitrary and you can use mnemonics to make a key or random number generator to pick characrters. You will need two keys: one private, one public. There is no difference between them visually: the same gibberish. The difference will transpire in the method itself. Once you created a method you use your public key to encrypt your meaningful text and send the encrypted message to what they call: Alice in quantum mechanics. Alice uses her own private key which only she posseses and decrypts the message. If she wants to send a message back she cannot use the same pair of keys, mind you. It is all assymentrical.
You have to worry neither about the keys nor the text. Worry about the method. There are excellent descriptions in wikipedia, find one and see how they come up with theirs. You may use that as a springboard.
AlexB Edited by AlexBB  Vista Ult64 SqlSer64 WinSer64 Wednesday, November 05, 2008 1:39 AM

I understand the concept of public/private encryption, though am confused as to the implementation. For that reason i was considering symmetric (private) encryption only.
How about a more detailed Vigenere algorithm which takes the user password, the plain text, as well as a third variable such as a system specific variable (hard drive serial or something). That way, Malice would need the users password as well as the hard drive serial to be able to decipher the text. Just adds the complexity and the number manpiulation would increase the difficulty to crack....
So Vigenere:
Ci = Pi + Ki
which is crackable and insecure. But:
Ci = Pi*Ki+(Zi*Ki) for example..... That way Zi and Ki is required

see b;og for actaul application using encryption http://christianzachristiancvdh.spaces.live.com/default.aspx?mkt=enGB&partner=Live.Spaces download the source and play when settng user name & password.
I hardcoded the key in the app but you will get the idea. 
Suggy said:
So Vigenere:
Ci = Pi + Ki
which is crackable and insecure. But:
Ci = Pi*Ki+(Zi*Ki) for example..... That way Zi and Ki is required
Actually, there is no difference between the two. If you factor it out it becomes (Pi+Zi)*Ki, and that just means your Pi is offset by Zi. Same effort needed to crack.
BTW, if your instructor expects you to come up with unbreakable (or even difficult to break) encryption, they are being naive. Experts spends months/years coming up with methods that turn out not to work, to expect you to do so in a short time is ridiculous.
Ron Whittle 
Unfortunately my instructor is indeed expecting us to come up with a strong method. Apparently to get good marks the encryption algorithm must be getting as good as DES for example, though obviously not all the way. So for students (i.e me) that has never done crypotgraphy before and only few modules in C/Java, we're now in the deep end!
Ok so that example with Zi may not have been great. How about another form of manipulation between the 3? Or am I going the wrong way abou things.
Christian, read through that application but found calls for the Rijndael class. We have to make it our 100% own algorithm. Thanks Edited by Suggy Wednesday, November 05, 2008 10:29 AM

BTW, if your instructor expects you to come up with unbreakable (or even difficult to break) encryption, they are being naive. Experts spends months/years coming up with methods that turn out not to work, to expect you to do so in a short time is ridiculous.
Ron Whittle
Quantum encryption is unbreakable in principle. Rijndael is unberakable in practive provided you use long enough keys. MS has classes that allow 1 kilobit key length.
What the OP has to do is to come up with his "smart" algorithm and ask the professor to beak it in his spare time until the end of the term.
Using hardware for encryption is considered not a good idea, I believe.
AlexB 
I agree with Ron. As an exercise in learning hashing/encryption it's probably useful. But, to make you think that it's good to write your own hashing/encryption is very bad. As Ron says, people spend years (and companies stake their entire business) on encryption algorithms.
Likely no matter what you do algorithmically will be "easy" to crack. If the exercise is to come up with a hardtocrack algorithm with little or no expertise in encryption then I would consider taking another course...
http://www.peterRitchie.com/blog 


Peter Ritchie said:
I agree with Ron. As an exercise in learning hashing/encryption it's probably useful. But, to make you think that it's good to write your own hashing/encryption is very bad. As Ron says, people spend years (and companies stake their entire business) on encryption algorithms.
Likely no matter what you do algorithmically will be "easy" to crack. If the exercise is to come up with a hardtocrack algorithm with little or no expertise in encryption then I would consider taking another course...
http://www.peterRitchie.com/blog</A< p>
So the problem becomes how do I get a good grade I guess.
Another course isn't a possibility. This is a module as part of a Network Computing degree, in my final year (4th). 7 weeks in is way too late to be changing modules too.
In the lecturers words, "Vigenere would barely get you a pass. It has to be much harder to break, while asymmetric encryption going towards RSA standards would be enough for a very good grade with a good report".
I'm aiming for that in between I guess. Unfortunately when even experts are just advising me to take another course, it simply increases my worry and stress about the assignment :( 
I'm aiming for that in between I guess. Unfortunately when even experts are just advising me to take another course, it simply increases my worry and stress about the assignment :(
Don't take it that hard. Peter Ritchie is a great guy, he is an expert but not in encryption. Take it easy. I think your teacher is looking for some originality in design that's it, they want you t show creativity, come up with an elegant solution. It may not necesserily be immediately unbreakable. You can show them that given more time you will be eable to improve upon it. Save the arguments presented here that following the assignment to the letter will take years for a team of mathematicians to accomplish with a rather uncertain outcome.
AlexB 
AlexBB said:
I'm aiming for that in between I guess. Unfortunately when even experts are just advising me to take another course, it simply increases my worry and stress about the assignment :(
Don't take it that hard. Peter Ritchie is a great guy, he is an expert but not in encryption. Take it easy. I think your teacher is looking for some originality in design that's it, they want you t show creativity, come up with an elegant solution. It may not necesserily be immediately unbreakable. You can show them that given more time you will be eable to improve upon it. Save the arguments presented here that following the assignment to the letter will take years for a team of mathematicians to accomplish with a rather uncertain outcome.
AlexB
I've emailed my tutor with my concerns. I think its unreasonable to make a good grade impossible so hopefully he'll supply some extra information or help.
Will let you know updates :) 
Take a look at Elliptical Curve Cryptography. You can probably implement a simplified version, and I doubt your instructor will break it anytime soon.
Ron Whittle 
Hey guys. Update from the lecturer:
"it is actually reasonably straightforward to produce a symmetric algorithm (public key algorithms are a different matter) that would be extremely difficult to break  I am not saying unbreakable, but one which would require access to significant resources to break and a significant investment of time on the part of accomplished cryptanalysts. It could be unbreakable for individuals (even experts without access to considerable computing power or a long period of time) or normal commercial organisations. Students who have got first class marks in this assignment in previous years, have been able to develop algorithms that would prove difficult to break in this sense. [As a clue you should not restrict yourself to using a single technique]"
So I guess based on that just doing a few vigenere modifications and some other basic methods on top of that will do?
Right just have to figure out how to code it! 
Your lecturer is fooling himself. I'm sure he believes this, but that doesn't make it true. Combining methods usually results in something that is less secure, not more. I'm guessing he has some simple tools for cracking ceaser, vigenere and possibly enigma. If you pass those, you get a good grade :)
Ron Whittle 
Am tempted to put in a complaint about all this to the University. In the same module, a different lecturer was teaching us that there are only 3 classes of IP addresses (which may be on the exam!!!) and no more. Also saying that UTP is max 10mbps unless using.. wait for it... "optical FIBBER".
Glad my tuition fees are going to good to use.
So back on topic, he does seem like the type of guy that would just stick it in some simple bit of software and see what the results are. Wonder if I can get my hands on some to test my own algorithms with. I guess if I use vigenere with that 3rd variable, the tools used to crack vigenere wouldn't spot it if I ensured it can't be factored down.. and then with the results do some other method.
Tried looking at enigma. Gave me a headache though it was at 2am. Will try again tomorrow. 
Hey guys.
Could really do with a hand here.
So I'm going with a variation of the hill cipher.
I've implemented the encryption algorithm, but in order to decrypt and I need to find the inverse matrix.
So I have a 3 x 3 key for example k[],
1, 5, 7
3, 4, 2
8, 9, 3
(random, havent tested if that matrix can be inverted but just typed this for example)
int p1 = b;
int p2 = y;
int p3 = e;
int ciphertext1;
int ciphertext2;
int ciphertext3;
I then do my encryption algorithm
ciphertext1 = k[1]*p1+k[2]*p2+k[3]*p3;
ciphertext2= k[4]*p1+k[5]*p2+k[6]*p3;
ciphertext3= k[7]*p1+k[8]*p2+k[9]*p3;
So that matrix is started in
int k[] = new int[9];
So how can I create inversek[] which inverses k to use the same formula with the decryption method! Also will I need to create k in a proper matrix array so k[i,j] to do this?
Really struggling on this!

havent tested if that matrix can be inverted but just typed this for example
That matrix can be inverted since it has a nonzero determinant (6). in order to invert your matrix you should compute Adjugate and divide it by the determinant (Cramer rule which is efficient for small matrices and yours is small). A complete solution for a 3D matrix is given here. It is very simple.
AlexB 
Hmmm I'm really struggling on being able to find the inverse matrix for a 5x5 and coding this in C#!
I've read through a few different methods though they are all rather complex, and my maths skills (though quite reasonable) isn't quite going that far...
I'm wondering if I should just completely change my ideas or keep trying to work this one out.
So if we have matrix[25] i need to find inverseMatrix[25] (or 5,5 instead of 25 for both)
I just can't figure out a formula that will work! :( Managed to code the entire application except for this algorithm.... Including the transfer between the passphrase and the 25 digit key securely. 
You have to make up you rmind as what rank your matrix will be. If it is 25 (I presume the rank of the matrix is equal to the number of rows/columns, otherwise it will be reduceable) then you better use Gauss method. I can post a link or try to sketch a chunk of code for that. It should be a rather simple iterative or perhaps recursive procedure .
AlexB 
It's definitely a 5 x 5 matrix.
Bearing in mind I configured it as a single array with 25 characters rather than a double array with 5 in each...
Any help with some code to create an inverse matrix would be greatly appreciated. I'm very comfortable with fors/ifs/whiles etc etc.. its just the maths of this that I'm struggling with and Gauss is going a bit over me at the moment... 
How are the elements listed: by columns or rows? This will be by rows (first row first, tehn the second, etc):
a11 a12 a13 a14 a15
a21 a22 a23 a24 a25
a31 a32 a33 a34 a35
a41 a42 a43 a44 a45
a51 a52 a53 a54 a55
That is: a11 a12 a13 a14 a15 a21 a22 a23 a24 a25.....
AlexB 
My contribution to this comes in two phases, and is a bit more abstract.
Perhaps the best place to start is with the notion of sboxes (substitution boxes). A lot of modern symmetric ciphers are based on sboxes, including DES/3DES and Blowfish/twofish. sboxes are basically a giant matrix of bit substitutions for plain text. Once you have the key and plain text, you can get the cipher text by looking up the substitutions, and conversly, you can use the key with the ciphertext on the same sbox to get the plaintext substitutions. There are a few principles you could take from different implementations. For example, DES has a fixed sbox, and the key is mangled with the input to get the correct substitution bits. On implementations like Blowfish/twofish, the sbox is dynamically generated by running the key bits repeatedly through a simple function to create a feistel network. Even a simple function like the one used by blowfish can be more than adequate when the sbox is dynamic and is designed to be resistant to linear and differential cryptanalysis. In fact, the original paper on blowfish is a very good primer on how the process in general works, reguardless of the actual implementation.
So the second part is then how to feed a "password" into a system to generate a key. I think this was something you were hung up on previously in the thread. In many ways, this is going to boil down to hashing algorithms, which take an arbitrary length of input and respond with output that is:
* fixed length
* always the same result if the same input is used
* as collisionfree as possible, in other words, differing input should never result in the same output
* vastly different result even when similar (but not exactly the same) input is used, which makes the result resistant to approximation and reverse engineering
In other words, if I, as an attacker, guess at the password and hash "HowNowBrownCow" and look at the result, I shouldn't get any clues that the words Brown and Cow are in the real password, even if my guess isn't an exact match. Simplistic example, but you get the idea.
With that in mind, I have to stop and say that generally, passwords or passphrases are rarely used as encryption keys. This is generally a bad idea for a variety of reasons. So the emphesis is then on how to derrive a safe key based on a password, if you are determined not to use a random key. And that's where I have to say that normally, the key derivation process is completely separate from the encryption algorithm. To get some interesting perspectives, you might want to look up RFC 2898, which gives some examples on this type of a system. The takeaway from this last paragraph is that deriving the key based on a passphrase could and should be a different algorithm, which has little to do with your encryption algorithm. And if the instructor buys into that (which he should), then you can use a method like RFC 2898 outlines to get the key from a passphrase, and then your encryption algorithm can assume that it will always get a fixedlength set of bits as a key, and you can concentrate strictly on that algorithm.
Rob Teixeira 
Rob threw a monkey wrench into the thread:) Make up yur mind as to what you want to do. I have already begun writing the code. Not that simple as I thought but I am able to finish it albeit not tonight. I've done a more complicated things in my life. If you are to go his way, I will suspend it.
AlexB Edited by AlexBB  Vista Ult64 SqlSer64 WinSer64 Monday, December 01, 2008 11:22 PM

Sorry, I didn't mean to throw any kind of wrench into the thread :)
But I figured some of that might actually make his job easier (not to mention safer, even from an academic point of view). This way, he could use an existing hash algorithm to get the key from the passphrase, and his own encryption algorithm could always assume it would get a fixedlength key, the way most ciphers work in real life.
Rob Teixeira 
Thanking all the help been given so far :)
Remembering that this iis meant to be a "simple" assignment according to the lecturer, I'm going to try and avoid sboxes. If anything, I'm getting too complex even trying just the hill cipher as it became apparent this evening that using the inverted matrices, due to decimal places in the results, may provide impossible for decryption.
I still want to go with the matrices though if my theory fails, then I'll have to refer back to a more basic method such as a modification of the vigenere to combat known problems with it.
Saying that I still want to try and attempt to complete the hill cipher while using a 5 by 5 matrix.
Alex... My array is pretty simple:
keyMatrix[25]
So the matrix is simple
keyMatrix[0] keyMatrix[1] keyMatrix[2] keyMatrix[3] keyMatrix[4]
keyMatrix[5] etc....................
So it's a logical matrix rather than a configured double array.
Thanks 

AlexBB said:
OK I got the message but too tired tonight to proceed. Will try again tomorrow.
AlexB
Much appreciated :) If it's easier discussing this on msn I don't mind doing so:
<removed>
without the obvious.. Edited by Suggy Tuesday, December 02, 2008 11:50 AM




AlexBB said:
OK, I debugged what I believe is 1/3 of the routine. It is to make an upper triangular matrix from the source matrix by Dolittle metod. It will take me some time still.
AlexB
I'm getting more and more concerned that using a 5 x 5 matrix full of numbers between 100 and 300 for encryption, when using the inverse matrix the results will not be as expected...
If it's time to let this go and change my method just say... I now have a week to complete this unfortunately. Got all the paperwork done.. Just need to fix the algorithm. 
I am not under a deadline:) I am sorry. I will try my best. It is not that difficult but some concentration and a very clear head are a must. The logic of matrix calculus is complicated. The method/routine is generalized, you will be able to inverse a matrix of any rank, however, the amount of computation may go up exponentially. Actaully there is a formula in wikipedia that tells you how the comp time depends on the rank.There is a remote chance I will finish it tomorrow. I won't be a bad idea if you could find a sample 5x5 matrix with a known inverse to verify. I will try to verify the veracity, of course, by multiplcation. If the resultant product is unit, then the method is correct. I may need it for my own needs eventually, this is why I am doing it. I used to have all them in another language that is not .NET related, unfortunately: VFP.
AlexB 
AlexBB said:
I am not under a deadline:) I am sorry. I will try my best. It is not that difficult but some concentration and a very clear head are a must. The logic of matrix calculus is complicated. The method/routine is generalized, you will be able to inverse a matrix of any rank, however, the amount of computation may go up exponentially. Actaully there is a formula in wikipedia that tells you how the comp time depends on the rank.There is a remote chance I will finish it tomorrow. I won't be a bad idea if you could find a sample 5x5 matrix with a known inverse to verify. I will try to verify the veracity, of course, by multiplcation. If the resultant product is unit, then the method is correct. I may need it for my own needs eventually, this is why I am doing it. I used to have all them in another language that is not .NET related, unfortunately: VFP.
AlexB
I'm afriad I don't have a known 5 x 5 with the result.. This is the area I'm struggling on as it means I can't even test my theory without programming a formula to work out the inverse haha.
I'm not meaning to rush you at all :) it's more of a risk management thing. Tomorrow I'll work on a more basic algorithm that I can definitely reverse. Meanwhile it would be interesting to see your fnidings so we can see if the algorithm is a fail or not!
Thanks 
Hey again.
Just in case the matrix algorithm is an epic fail, I've created a much more basic vigenere method which uses a 128 bit key.
The issue I have is reading from a file and creating an array, with each section having 3 digits from the text file.
So in other words lets say input.txt is:
123456789
I want
cipher[0] = 123;
cipher[1] = 456;
cipher[2] = 789;
I've got as far as reading the entire file into a string. So I have:
string cipherText = 123456789;
so now how do I split that up into the above array.
Note that the file will ALWAYS be divisible by 3. So it will always have numbers in blocks of 3.
Pseudo Code (fake code) for what I need:
for cipher[x], until x is (cipherText.Length/3), x++
{
cipher[x] = cipherText[a]cipherText[b]cipherText[c];
a +3;
b +3;
c +3;
}
So really I guess I need to know how to put a b and c together!
Thanks 
My suggestion is to let the cipher deal with byte arrays alone, and not worry about file IO or encoding strings. If you separate both concepts, then things get a little easier.
So now, the cipher accepts byte arrays, and spits out byte arrays. Easy.
When you want to save to a file, and want it to be text (rather than binary data), simply call Convert.ToBase64String(cipherByteArrayResult), which returns a Base64encoded string that represents the byte array data. This is a very standard way of representing binary data as text, and doesn't require you to translate or encode anything yourself (for example, all binary data in XML is represented this way). Again, easy.
When you want to read the text from file and convert it back to a byte array to feed into the cipher, simply call Convert.FromBase64String(cipherText).
I should mention that Base64 encoding only deals with singledimention byte arrays, but it should be straightforward to break it out into multidimentions/matrixes from there.
Rob Teixeira 
Fantastic.
Thanks for all your help guys. I just finished (at 2am) developing my application.
You give it a .txt file and a password, it gives you a file just full of numbers...
For the deryption you do the same, and it either gives you a file with the exact duplicate of the original, or if the wrong password is entered (even just 2 digits out) it gives COMPLETE junk...
Mission accomplished I feel. This should get me a decent grade though maybe not top. It's a standard stream cipher I guess so we'll see.
Anyone know any tools I can test the strength of the ciphertext with?
Anyway thanks for everyones input. It all got entered someway or another :) 
It seems you've taken a different path, correct? My routine that is supposed to invert a matrix of arbitrary rank is 90% done. I completed decomposition of the matrix into two triangular matrices. It is a major step. The next step is to invert them which is much easier in case of the triangulars. If I have time today I will finish it. If not it will be done tomorrow. If you don't need it anymore, I may be not in a hurry.
P.S. I just googled for triangular invert add found a C# routine from the University of Tennessee that inverts complex triangulars. You can use that. The product of lower triangular inverse and upper triangular inverse is the inverse of the matrix in question. Thus the problem of matrix inverse programmatically may be considered solved.
AlexB 
Cryptanalysts have a variety of tools I'm sure, but I think the most valuable analysis comes from manually probing algorithms for weaknesses. A lot can be done by just mathematically dissecting the algorithm to see if it's possible to come up with a way to break keys at less then bruteforce time. This could happen, for example, if they detect classes of week keys that lead to collisions, or by determining that it's possible to look at the output and somehow mathematically factor the key, or at least parts of it. Sometimes, this analysis comes about through simple experimentation by running known keys and data through the algorithm looking for how the output is affected with by small intentional alterations. In any case, each algorithm is to some degree unique, so the best an automated tool can do is look for classes of common attacks, but real and specific weaknesses are usually discovered somewhat manually. That was the case for DES, for example, when people found that a combination of its static sboxes with certain "weak" keys (bit combinations) could easily be cracked.
Having said that, one thing to be careful of is that stream ciphers (those that operate on a single byte at a time) are inherently weaker than block ciphers (which operate on multiple bytes at a time). That's because block ciphers can manipulate, induce, and combine bits from multiple characters of the plain text. What that does is prevent small repetative sequences of plaintext input from providing reliable repetative sequences of ciphertext output, which is much easier to analyze. In the worst case scenario, an attacker could induce a piece of known text, look at the result, and deduce the contents of other ciphertext purely from the output patterns.
Rob Teixeira 
Rob Teixeira said:
Cryptanalysts have a variety of tools I'm sure, but I think the most valuable analysis comes from manually probing algorithms for weaknesses. A lot can be done by just mathematically dissecting the algorithm to see if it's possible to come up with a way to break keys at less then bruteforce time. This could happen, for example, if they detect classes of week keys that lead to collisions, or by determining that it's possible to look at the output and somehow mathematically factor the key, or at least parts of it. Sometimes, this analysis comes about through simple experimentation by running known keys and data through the algorithm looking for how the output is affected with by small intentional alterations. In any case, each algorithm is to some degree unique, so the best an automated tool can do is look for classes of common attacks, but real and specific weaknesses are usually discovered somewhat manually. That was the case for DES, for example, when people found that a combination of its static sboxes with certain "weak" keys (bit combinations) could easily be cracked.
Having said that, one thing to be careful of is that stream ciphers (those that operate on a single byte at a time) are inherently weaker than block ciphers (which operate on multiple bytes at a time). That's because block ciphers can manipulate, induce, and combine bits from multiple characters of the plain text. What that does is prevent small repetative sequences of plaintext input from providing reliable repetative sequences of ciphertext output, which is much easier to analyze. In the worst case scenario, an attacker could induce a piece of known text, look at the result, and deduce the contents of other ciphertext purely from the output patterns.
Rob Teixeira
Rob,
Thanks for your input. My original algorithm was looking at using a matrix in a block cipher for the reason you specified.
The one I've developed is a stream cipher. A modification of the Vigenere. The only reason frequency analysis cracks this is with a short password (woven frequency analysis). I've used an algorithm to create a very long key meaning, you'd need over 100,000 characters to do any useful frequency analysis. The ciphertext is a text file full of numbers too, meaning the attacker can't seperate punctunctation or crack common words.
It's not amazing. It can prob be cracked by experts quickly. I feel it's enough for a good grade at least though but I guess we'll find out!