DNSChat2 - Design
====================

12/08/2017


Overview
-----------

DNSChat V1 (https://projects.bentasker.co.uk/jira_projects/browse/DNSCHAT.html) relied on a Peer-to-Peer model. While that reliance should (in theory) make related queries harder to identify (due to the huge number of possible destinations), it's not particularly convenient to run.

The aim of DNSChat2 is to take the underlying concept and implement a Server-Client model, whilst maintaining End-to-End encryption (so that the server cannot be used to compromise the messages being sent).

So clients will establish a "session" with a central server, and poll it periodically to check for new messages (and send messages via it).

Further down the line, it'd be nice to combine the two concepts, so that a DNSChat2 client could set a notifier that tells other clients to send messages to it directly (at a published location), but that's out of scope for the time being.


Architecture
==============

Server
-------

The central server will essentially be a PowerDNS install with a Python back-end to handle the messages.

The Python back-end will write messages etc. out to a MySQL database so that state can be tracked. This has the additional benefit that PowerDNS can be told to spin up multiple instances of the back-end to increase capacity. With a shared database, the "server" could also be a cluster of servers, increasing resiliency as well as capacity.

Client
--------

The client will be a (hopefully) small Python script, with a CLI-based interface.

The first incarnation of DNSChat was very much a proof-of-concept, so had an incredibly simple interface.
For DNSChat2, although it's still a PoC, I'd like to go for a slightly more advanced interface - something more IRC-like (later on, if a GUI is wanted, something like HexChat (http://hexchat.readthedocs.io/en/latest/script_python.html) could perhaps be adapted and used).

Transport Misc
----------------

Although the primary and default transport will (of course) be DNS, the section which generates network requests should be abstracted out when it is implemented, so that it can easily be replaced with a module that (for example) makes HTTP requests. That way the client can more easily be used across multiple protocols (and HTTP will allow a Tor hidden service to be used as the endpoint).


Crypto
=========

As with DNSChat1, the primary crypto will be PGP, though in DNSChat2 it'll use public-key encryption (PKI) rather than symmetric encryption.

* Users will publish their public key to the server (to begin with this'll be a case of manually adding it to the database, but in the long run there'll probably be an associated web service for signing up, managing accounts etc.).
* The server will also maintain a known symmetric key, used to encrypt/decrypt initial control messages (for example, to log in).
* Whenever a user logs in, a symmetric session key will be generated for that user's session. The client will use that key to encrypt messages being sent to the server.

So each query generated by the client will use layered encryption, giving roughly the following workflow

* User1 enters "Hello World" to be sent to User2
* Client encrypts the message with User2's public key
* The control characters (message type, who it's for etc.
Basically the message metadata) are generated, and encrypted using the session key
* Query goes to server
* Server decrypts the control data, processes the query and stores the result in the database (so the message content is still encrypted)
* User2's client will (at some point) poll to see if there are queued messages
* A response message will be built, encrypted with User2's session key and returned
* User2's client decrypts the response using the session key
* User2's client decrypts the message using his/her private key
* Client displays User1> Hello World

The session encryption will likely be a simple XOR based cipher, and exists to serve two purposes.

* To ensure that any analysis of traffic doesn't simply show the presence of PGP encrypted strings (hence not using PGP for the outer layer)
* To provide some in-flight protection to the elements of metadata that the server must be able to read in order to decide whether/how to process the message

Session keys will change at each login, but may also cycle at shorter intervals.


Authentication
================

The workflow used in order for a client to 'Log In' to the service will be as follows

* Client generates a message requesting a nonce, and providing its username/id
* Client uses the known symmetric key to encrypt the message and sends it to the server
* Server decrypts the message, generates a random string (the nonce), encrypts it with the user's public key (stored in the database) and returns it to the client
* Client decrypts the nonce, signs it with its private key and returns the result to the server
* Server verifies the signature, and if it passes, generates a session key to use. Session key is encrypted with the client's public key (and stored in the DB) and returned to the client
* Normal operations commence

In order for the back-end to efficiently process login requests, a plaintext control char will likely need to be present to indicate that it's a request using the global key.
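
As a rough illustration of that key selection, the sketch below assumes the plaintext control char is the message type carried in the unencrypted, hex encoded first label (see Control messages), with INIT meaning "use the global key". The names `GLOBAL_KEY`, `SESSION_KEYS` and `select_key` are hypothetical, and the dictionary merely stands in for a database lookup; none of this comes from an actual implementation.

```python
import json

MSGTYPE_INIT = "0"                      # INIT (log in), per the Message Types table
GLOBAL_KEY = b"known-global-key"        # hypothetical value for the known symmetric key
SESSION_KEYS = {7: b"per-session-key"}  # hypothetical stand-in for the sessions table


def select_key(first_label):
    """Return the key to use for a query, based on its plaintext first label.

    The label is hex decoded, the stripped JSON brackets are re-added, and
    the message type decides whether the global key or a session key applies.
    Returns None when the session id is unknown.
    """
    decoded = bytes.fromhex(first_label).decode()
    msgtype, sessionid = json.loads("[" + decoded + "]")[:2]
    if str(msgtype) == MSGTYPE_INIT:
        return GLOBAL_KEY
    return SESSION_KEYS.get(sessionid)
```

A query whose first label decodes to an unknown session returns `None`, which a back-end could treat as a "normal" query and skip any further processing.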
The aim, though, is to ensure the queries are less readily identifiable than those in the original DNSChat (https://www.bentasker.co.uk/documentation/security/300-pgp-encrypted-text-chat-via-dns).


Control messages
=================

The control portion of the message forms the first and second labels in the name being queried, and should be submitted in a hex encoded format.

When being formatted for dispatch, the surrounding brackets should be removed from the original JSON string (to save a couple of bytes) and will be re-added by the server when decoding.

This gives an FQDN of the following format

    2230222c2030.130211181543504168111001020c5540.130b5016194140372c5841135d4641034330544147145804111025565711181.example.com

Where the first two labels are control message portions


Control Message Portion 1 format
----------------------------------

Portion 1 is the first label in the queried name:

    *2230222c2030*.130211181543504168111001020c5540.130b5016194140372c5841135d4641034330544147145804111025565711181.example.com

The control message is a JSON array, with the following structure

    [msgtype, sessionid, ]

This is added without encryption (as the sessionid allows the server to find the session key)


Control Message Portion 2 format
----------------------------------

Portion 2 is the second label in the name being queried

    2230222c2030.*130211181543504168111001020c5540*.130b5016194140372c5841135d4641034330544147145804111025565711181.example.com

Again a JSON array

    [msg num, fragment count, msgid, ]

This is encrypted with the session key and then hex encoded


Message Types
--------------

Within the protocol, message types are indicated by their integer ID, however for the purposes of clarity within any documentation they've been assigned specific names.

    id   Name     Description
    0    INIT     Initiate a session (essentially - log in)
    1    ACK      Acknowledge - confirm receipt of message IDs.
                  IDs are in message body
    2    POL      Polling request - check for queued messages
    3    POLACK   Acknowledge ids in message body, and request list of any outstanding messages
    4    KILL     Kill the existing session (essentially, log out)


Message Body
===============

The message body is a JSON array, and (due to DNS label length limits) may be split across multiple queries. The server will handle re-assembling them.

The entire JSON array is encrypted using the session key, then hex encoded before the message is split into chunks of up to 63 characters.

Its format is

    msgbody = [
        str(msg['ctrl']['nonce']),
        msg["messagebody"]["msg"],
        msg["messagebody"]["to"],
        msg['messagebody']['user']
    ]

The second field (msg) contains the text being sent - though this will be in a PGP encrypted format.


Notes
=======

This design is only an initial pass at the new implementation; work never really started on building anything based upon it.

It also doesn't currently address the issue of queries being reasonably easy to identify with a regex - though that's a hard issue to address, as the server will need to be able to identify them in order to avoid doing too much processing on "normal" queries.
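
To make the encoding described under Control messages and Message Body more concrete, here is a minimal sketch of how query names might be assembled and decoded. It is not taken from any implementation (none was built). The repeating-key XOR is only a placeholder for the "XOR based" session cipher, the function names are hypothetical, and treating "msg num" as a zero-based fragment index is an assumption, as is leaving the brackets on the message body.

```python
import json
from itertools import cycle

MAX_LABEL_LEN = 63  # DNS limit on the length of a single label


def strip_brackets(values):
    """Serialise a list as JSON and drop the surrounding [ and ]."""
    return json.dumps(values)[1:-1]


def xor_bytes(data, key):
    """Placeholder repeating-key XOR (assumption: the design only says 'XOR based')."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def encode_portion1(msgtype, sessionid):
    """Portion 1: unencrypted JSON array, brackets removed, hex encoded."""
    return strip_brackets([msgtype, sessionid]).encode().hex()


def decode_portion1(label):
    """Server side: hex decode, re-add the brackets, parse the JSON."""
    return json.loads("[" + bytes.fromhex(label).decode() + "]")


def encode_body(fields, session_key):
    """Body: JSON array -> session-key XOR -> hex -> chunks of up to 63 chars."""
    hexed = xor_bytes(json.dumps(fields).encode(), session_key).hex()
    return [hexed[i:i + MAX_LABEL_LEN] for i in range(0, len(hexed), MAX_LABEL_LEN)]


def decode_body(chunks, session_key):
    """Server side: join the chunks, hex decode, XOR with the key, parse the JSON."""
    raw = bytes.fromhex("".join(chunks))
    return json.loads(xor_bytes(raw, session_key).decode())


def build_names(msgtype, sessionid, msgid, fields, session_key, zone="example.com"):
    """Build one query name per body chunk: portion1.portion2.chunk.zone."""
    chunks = encode_body(fields, session_key)
    portion1 = encode_portion1(msgtype, sessionid)
    names = []
    for num, chunk in enumerate(chunks):
        # Portion 2 is encrypted with the session key, then hex encoded.
        ctrl = strip_brackets([num, len(chunks), msgid]).encode()
        portion2 = xor_bytes(ctrl, session_key).hex()
        names.append(".".join([portion1, portion2, chunk, zone]))
    return names
```

With these definitions, `encode_portion1("0", 0)` gives `2230222c2030`, which matches the first label of the example name above. Because a chunk may end half-way through a hex encoded byte, the chunks have to be joined before hex decoding, which is why `decode_body` takes the full list rather than decoding each label separately.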