Recent Blog Posts http://tadas.vilkeliskis.com/feed.atom 2012-01-16T00:00:00Z Recent blog posts Werkzeug Row Level Locking in Django http://tadas.vilkeliskis.com/2012/1/16/django-row-level-locking 2012-01-16T00:00:00Z Tadas Vilkeliskis <p>For one of the fixes I was working on at work, I had to implement row level locking in Django. The current stable version of Django, 1.3, has no built-in support for row level locking on InnoDB tables. The good news is that the development version already has an update to the <cite>QuerySet</cite> API that lets you use the <cite>select_for_update</cite> method to acquire a write lock on the rows matching your query. If you can use the development version for your project, you may stop reading and go upgrade Django; otherwise I will see you at the bottom of the page.</p> <div class="section" id="locking-your-models"> <h2>Locking your models</h2> <p>Since you're building your web application with Django, you are most likely using the Django ORM to access the database. However, in order to perform locking queries we will have to write the SQL ourselves, which is much easier than you might think (unless you need some complex join across a million tables). Assuming you have an <cite>Article</cite> model, you acquire a row level lock as follows:</p> <div class="highlight"><pre><span class="n">article</span> <span class="o">=</span> <span class="n">Article</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">raw</span><span class="p">(</span><span class="s">&quot;SELECT * FROM articles WHERE id = 123 FOR UPDATE&quot;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> </pre></div> <p>When this query runs, an instance of your Django web application will remain blocked until the instance that acquired the lock first releases it or a lock wait timeout occurs. 
One thing to note here is that locking will only work if your code is executing inside of a transaction. Ultimately, you want to have something like this:</p> <div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.db</span> <span class="kn">import</span> <span class="n">transaction</span> <span class="kn">from</span> <span class="nn">MySQLdb</span> <span class="kn">import</span> <span class="n">OperationalError</span> <span class="k">def</span> <span class="nf">hack_article</span><span class="p">(</span><span class="n">article</span><span class="p">):</span> <span class="c"># Some magical code that does something with the article.</span> <span class="k">pass</span> <span class="nd">@transaction.commit_on_success</span> <span class="k">def</span> <span class="nf">function</span><span class="p">():</span> <span class="k">try</span><span class="p">:</span> <span class="n">article</span> <span class="o">=</span> <span class="n">Article</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">raw</span><span class="p">(</span><span class="s">&quot;SELECT * FROM articles WHERE id = 123 &quot;</span> <span class="s">&quot;FOR UPDATE&quot;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span> <span class="c"># Handle not found</span> <span class="k">return</span> <span class="k">except</span> <span class="n">OperationalError</span><span class="p">:</span> <span class="c"># Handle lock timeouts</span> <span class="k">return</span> <span class="n">hack_article</span><span class="p">(</span><span class="n">article</span><span class="p">)</span> </pre></div> <p>The write lock will get released once the transaction is committed or aborted, in the above case this happens once we leave the <cite>function</cite>. 
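</p> <p>The lookup above hard-codes the id into the SQL string. As a minimal sketch, assuming the same <cite>articles</cite> table, the id can instead be passed as a query parameter, which <cite>raw()</cite> accepts as a list. The helper name <cite>lock_article</cite> is hypothetical and not part of Django, and the call still has to happen inside a transaction for the lock to be held.</p>

```python
def lock_article(model, article_id):
    """Return the locked row for article_id, or None if no such row exists.

    `model` is the Django model class (Article in the post). The function
    name and the None return value are choices made for this sketch.
    """
    # %s is filled in by the database driver, not by Python string formatting.
    rows = model.objects.raw(
        "SELECT * FROM articles WHERE id = %s FOR UPDATE", [article_id])
    try:
        return rows[0]
    except IndexError:
        # Indexing an empty result raises IndexError, as in the example above.
        return None
```

<p>Inside the <cite>function</cite> above, the lookup would then read <cite>lock_article(Article, 123)</cite>, with <cite>None</cite> meaning that the row was not found.</p> <p>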
There is actually one case (maybe there are more, let me know if I am wrong) where the lock can get released before the outer transaction is committed or aborted -- when a new transaction is started inside of another transaction. So, the above example would be prone to race conditions if our <cite>hack_article</cite> function looked like this:</p> <div class="highlight"><pre><span class="nd">@transaction.commit_on_success</span> <span class="k">def</span> <span class="nf">hack_article</span><span class="p">(</span><span class="n">article</span><span class="p">):</span> <span class="c"># Some magical code that does something with the article.</span> <span class="k">pass</span> </pre></div> <p>The reason for this is that MySQL does not support nested transactions: starting a new transaction implicitly commits the current one, and that commit releases its locks.</p> </div> <div class="section" id="wrapping-up"> <h2>Wrapping up</h2> <p>Row level locking in Django is as easy as 1, 2 and 3 :D</p> <ol class="arabic simple"> <li>Make sure that your tables use the InnoDB engine.</li> <li>Acquire locks inside of a transaction.</li> <li>Make sure that you are not creating new transactions inside of another transaction which holds the lock.</li> </ol> <p>That's all.</p> </div> How to establish a P2P connection http://tadas.vilkeliskis.com/2010/12/8/how-to-establish-a-p2p-connection 2010-12-08T00:00:00Z Tadas Vilkeliskis <p>I am working on a cool project which I don't want to disclose yet, and an essential part of it is creating a peer-to-peer connection between two computers. Creating a connection is not that difficult; however, it gets complicated when both computers are sitting behind a NAT device. The NAT device will create a private IP address space and be responsible for routing packets into and out of the private network. This means that an IP address and a port number associated with a particular service are not directly accessible from other computer networks and all inbound data packets must be routed by the NAT device. 
So how do you create a P2P connection when the NAT device is &quot;blocking&quot; access to a peer? The technique I will cover here is very elegant yet very simple --- it's called <a class="reference external" href="http://en.wikipedia.org/wiki/Hole_punching">hole punching</a>. Hole punching is a technique that opens a temporary port on the NAT device which is directly mapped to a port on the internal network. There are two types of hole punching: TCP and UDP. In this post I will only cover UDP hole punching because it has better support (82% for UDP and 64% for TCP according to this <a class="reference external" href="http://www.brynosaurus.com/pub/net/p2pnat/">source</a>) and is easier to implement. Let me illustrate what UDP hole punching is and then take a look at the implementation.</p> <p>(The illustration was lost when I moved to Posterous, but you can search Google Images for UDP hole punching to get the idea.)</p> <p>As you can see, only four steps are required to create a P2P connection. First, peers have to register with a broker. The broker is a computer with a public IP address; the registration process is very simple --- the broker stores the peer's id, its external IP address and its external port number. When a peer connects to the broker in step 1, the NAT device creates a temporary mapping from the external port to an internal port; that is to say, a hole is punched through the NAT. Next, the second peer wants to establish a direct connection with the first peer, but it does not know the port number and IP address of that peer. So, it contacts the broker and asks for the first peer's credentials. Once the credentials are received, the second peer can make a direct connection to the first peer through the temporarily allocated external port on NAT 1. In theory it does not look hard at all; in practice --- it is exactly the same. 
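</p> <p>The registration and lookup steps can be sketched without any networking. The following is a minimal illustration rather than the proof of concept itself: the <cite>Broker</cite> class and <cite>parse_peer</cite> function are hypothetical names, and the <cite>peer &lt;ip&gt; &lt;port&gt;</cite> message layout is borrowed from the code further down. The address stored for a peer is whatever <cite>recvfrom()</cite> reports, that is, the external IP address and port of its NAT.</p>

```python
class Broker(object):
    """In-memory registry mapping a peer id to its external (ip, port)."""

    def __init__(self):
        self.peers = {}

    def register(self, peer_id, addr):
        # Step 1: remember the address the registration datagram came from.
        self.peers[peer_id] = addr

    def lookup(self, peer_id):
        # Steps 2-3: build the reply for a peer asking about peer_id,
        # or return None when that peer never registered.
        addr = self.peers.get(peer_id)
        if addr is None:
            return None
        return "peer %s %d" % addr


def parse_peer(message):
    """Turn a 'peer <ip> <port>' reply back into an (ip, port) tuple."""
    tag, ip, port = message.split()
    if tag != "peer":
        raise ValueError("not a peer message: %r" % message)
    return (ip, int(port))
```

<p>The tuple returned by <cite>parse_peer</cite> is what the asking peer would hand to <cite>sendto()</cite> in step 4.</p> <p>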
I had trouble finding simple code examples showing how to implement hole punching, so I wrote some proof of concept code in Python to test whether this technique works.</p> <div class="highlight"><pre><span class="c">#</span> <span class="c"># Broker</span> <span class="c">#</span> <span class="kn">from</span> <span class="nn">socket</span> <span class="kn">import</span> <span class="o">*</span> <span class="n">HOST</span> <span class="o">=</span> <span class="s">&#39;&#39;</span> <span class="n">PORT</span> <span class="o">=</span> <span class="mi">9999</span> <span class="n">s</span> <span class="o">=</span> <span class="n">socket</span><span class="p">(</span><span class="n">AF_INET</span><span class="p">,</span> <span class="n">SOCK_DGRAM</span><span class="p">)</span> <span class="n">s</span><span class="o">.</span><span class="n">bind</span><span class="p">((</span><span class="n">HOST</span><span class="p">,</span> <span class="n">PORT</span><span class="p">))</span> <span class="n">peers</span> <span class="o">=</span> <span class="p">{}</span> <span class="k">while</span> <span class="mi">1</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">addr</span> <span class="o">=</span> <span class="n">s</span><span class="o">.</span><span class="n">recvfrom</span><span class="p">(</span><span class="mi">128</span><span class="p">)</span> <span class="k">print</span> <span class="n">addr</span><span class="p">,</span> <span class="n">data</span> <span class="n">peers</span><span class="p">[</span><span class="n">addr</span><span class="p">]</span> <span class="o">=</span> <span class="bp">True</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">peers</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">1</span><span class="p">:</span> <span class="n">peer1</span> <span class="o">=</span> <span class="n">peers</span><span class="o">.</span><span 
class="n">keys</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span> <span class="n">peer2</span> <span class="o">=</span> <span class="n">peers</span><span class="o">.</span><span class="n">keys</span><span class="p">()[</span><span class="mi">1</span><span class="p">]</span> <span class="n">s</span><span class="o">.</span><span class="n">sendto</span><span class="p">(</span><span class="s">&quot;peer &quot;</span> <span class="o">+</span> <span class="n">peer2</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="s">&quot; &quot;</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="n">peer2</span><span class="p">[</span><span class="mi">1</span><span class="p">]),</span> <span class="n">peer1</span><span class="p">)</span> <span class="k">else</span><span class="p">:</span> <span class="n">s</span><span class="o">.</span><span class="n">sendto</span><span class="p">(</span><span class="s">&quot;pong&quot;</span><span class="p">,</span> <span class="n">addr</span><span class="p">)</span> <span class="c">#</span> <span class="c"># Peer</span> <span class="c">#</span> <span class="kn">from</span> <span class="nn">socket</span> <span class="kn">import</span> <span class="o">*</span> <span class="n">BROKER_HOST</span> <span class="o">=</span> <span class="s">&quot;broker.host.com&quot;</span> <span class="n">BROKER_PORT</span> <span class="o">=</span> <span class="mi">9999</span> <span class="n">BROKER_ADDR</span> <span class="o">=</span> <span class="p">(</span><span class="n">BROKER_HOST</span><span class="p">,</span> <span class="n">BROKER_PORT</span><span class="p">)</span> <span class="n">sock</span> <span class="o">=</span> <span class="n">socket</span><span class="p">(</span><span class="n">AF_INET</span><span class="p">,</span> <span class="n">SOCK_DGRAM</span><span class="p">)</span> <span class="c"># to support two peers on the same machine</span> 
<span class="k">try</span><span class="p">:</span> <span class="n">sock</span><span class="o">.</span><span class="n">bind</span><span class="p">((</span><span class="s">&#39;&#39;</span><span class="p">,</span> <span class="mi">9900</span><span class="p">))</span> <span class="k">except</span><span class="p">:</span> <span class="n">sock</span><span class="o">.</span><span class="n">bind</span><span class="p">((</span><span class="s">&#39;&#39;</span><span class="p">,</span> <span class="mi">9901</span><span class="p">))</span> <span class="n">sock</span><span class="o">.</span><span class="n">sendto</span><span class="p">(</span><span class="s">&quot;ping&quot;</span><span class="p">,</span> <span class="n">BROKER_ADDR</span><span class="p">)</span> <span class="k">while</span> <span class="mi">1</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">addr</span> <span class="o">=</span> <span class="n">sock</span><span class="o">.</span><span class="n">recvfrom</span><span class="p">(</span><span class="mi">128</span><span class="p">)</span> <span class="n">new_data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">split</span><span class="p">()</span> <span class="k">if</span> <span class="n">new_data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s">&quot;peer&quot;</span><span class="p">:</span> <span class="n">addr</span> <span class="o">=</span> <span class="n">new_data</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="n">port</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">new_data</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span> <span class="n">sock</span><span class="o">.</span><span class="n">sendto</span><span class="p">(</span><span class="s">&quot;resp 
test&quot;</span><span class="p">,</span> <span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="n">port</span><span class="p">))</span> <span class="k">elif</span> <span class="n">new_data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s">&quot;resp&quot;</span><span class="p">:</span> <span class="n">sock</span><span class="o">.</span><span class="n">sendto</span><span class="p">(</span><span class="s">&quot;what goes around comes around&quot;</span><span class="p">,</span> <span class="n">addr</span><span class="p">)</span> <span class="k">print</span> <span class="n">addr</span><span class="p">,</span> <span class="n">data</span> </pre></div> <p>This code is very simple. The broker supports only two peers and must be started on a computer that has a public IP address. A peer punches a hole by sending a ping message to the broker. In this example peers don't request the address and the port number of another peer; instead, the broker sends this information to one of the peers as soon as the second peer has registered. If everything worked out, one peer will see a &quot;test&quot; message on its screen, whereas the other one will see &quot;what goes around comes around&quot;. I was able to test this code with one peer, that would be me, in the US and the other one, my friend, in Lithuania, and it works.</p> I am not dead, but only trying something new! http://tadas.vilkeliskis.com/2010/8/25/i-am-not-dead-but-only-trying-something-new 2010-08-25T00:00:00Z Tadas Vilkeliskis <p>It's been a while since my last post. A few (small) things have changed in my life since then, including the type of information I consume, which shifted drastically. I am a very technology oriented person and this makes me kinda geeky by definition. All my friends know that I am good with computers and programming. For those who don't know me, do you think it's because I study hard at college? 
No, it's because in my childhood I was so attracted to computers that video games were not cutting it anymore, so I found programming and got hooked. I read a great many books and articles on programming, and that's what made me so good at it. However, now I try to read as little technical material as possible, not because I don't want to grow but because I changed my priorities and have to grow in a different area of interest. My current area of interest is business. Why did I make such a change? Let's go back to one of my dreams first. About four years ago, when I entered university to study computer science, I thought that college was the last step I had to take, and that after graduating a great job would await me at a big company. Does this sound familiar to you? The main reason why I switched from one topic to another is that I don't like following rules, and a big company will have a huge rule book. So, I changed my dream of becoming a good programmer at a big company to becoming a business owner. For this to work I have to totally change my environment. I can't remember where I read it, but I memorized the following quote: spend one hour every day on a topic of your interest and after five years you will be a master of that specific topic. So, I started by exposing myself to business related information: books, twitter streams, magazines etc. During the summer I read three books whose titles I would like to share with you.</p> <ul class="simple"> <li><cite>Getting Real: The smarter, faster, easier way to build a successful web application</cite> by 37signals.</li> <li><cite>REWORK</cite> by Jason Fried and David Heinemeier Hansson.</li> <li><cite>Linchpin: Are you indispensable?</cite> by Seth Godin.</li> </ul> <p>All these books are business books. Let's start with Getting Real and REWORK. REWORK is actually a revamped and extended version of Getting Real. If you are planning to buy one of these, get REWORK. 
In REWORK the authors talk about common mistakes that startups make. If you are familiar with 37signals' practices you know that they go for simplicity and clarity in both their applications and their business environment. If you are like me and know nothing about business you must read it -- it will completely change the way you think about starting a business (especially a web/software one). I was working for a startup this summer and was able to see some of the mistakes that are covered in REWORK. For example, we had meetings twice a week where we had to update our status because there was some higher power involved in the company. The length of a meeting ranged from one to three hours, which is a complete waste of time, and REWORK nicely explains why &quot;meetings are toxic&quot;. Read this book! Linchpin is another great book. Its main purpose is to make the reader unleash the artist within. As the author says: we've been brainwashed. We are constantly being brainwashed by our friends, family, school and colleagues, who say that we have to follow the rules in order to secure our future. The book tries to engrave the exact opposite on the reader's mind. In order to secure your future you have to become an artist: you have to take an extra step that makes your work flawless or brings joy to people. The reason for taking this step is that it's the only thing that keeps you in your job or makes you a competitor of huge companies that have the power to deliver goods fast and at low cost. Linchpin is a very inspirational book. I hope you will be able to read these books. They are really worth your time. I have a few more books on my reading list which I am planning to cover here after I finish reading them. I also found a very interesting site called StartupDigest. If you subscribe you will receive an email once in a while with upcoming startup events in your area. Maybe I will meet you at one of these! 
:)</p> Creating a personal virtual machine for code obfuscation purposes http://tadas.vilkeliskis.com/2010/1/4/creating-a-personal-vm 2010-01-04T00:00:00Z Tadas Vilkeliskis <p>When it comes to protecting intellectual property, virtual machines and custom instruction sets can play a very important role. One of the ways to protect your algorithm from curious eyes is to use code obfuscation techniques. These can range from simple instruction reordering to more sophisticated control flow modifications and an added layer of a custom instruction set, or a combination of both. Recently I watched a video about virtual machines for code obfuscation from the <a class="reference external" href="http://recon.cx/">RECON</a> video <a class="reference external" href="http://www.archive.org/details/RECON2008">archives</a>. The speaker said that his implementation is available for download at his website for those who want to experiment; however, I could not find it, so I decided to implement my own compiler and virtual machine (vm).</p> <p>The code of the compiler and a testing version of the vm is available for download from <a class="reference external" href="http://github.com/tadasv/vms/">github</a>. I am not going to explain how they are implemented, but I will explain how you can create your own custom instruction set by extending the compiler and how to write code that the compiler understands.</p> <p>The compiler was written in Python because it makes working with text and strings trivial. There are three files: compiler.py, vmc.py and myis.py. compiler.py is basically the class that provides all the logic: lexical analysis, compiling, linking, etc. The parser in the compiler does not support comments. So, if you are going to write anything longer than my factorial example you may get lost in the code. 
The compiler supports only 7 types of tokens, and tokens are separated by whitespace:</p> <ul class="simple"> <li><cite>instruction</cite> -- an instruction (see the instructions variable in <tt class="docutils literal">myis.py</tt>).</li> <li><cite>register</cite> -- a register (see the registers variable in <tt class="docutils literal">myis.py</tt>).</li> <li><cite>&#64;register</cite> -- a register that holds a memory address and will be dereferenced by the virtual machine.</li> <li><cite>immediate</cite> -- an immediate value. Syntax: <tt class="docutils literal"><span class="pre">int(&lt;number&gt;)</span></tt> where <cite>&lt;number&gt;</cite> can be a number in any format supported by Python.</li> <li><cite>label</cite> -- a label. Syntax: <tt class="docutils literal">&lt;name&gt;:</tt></li> <li><cite>reference</cite> -- any name that does not fall under the instruction and register categories. Usually a label without a colon.</li> <li><cite>string</cite> -- a sequence of bytes. Syntax: <tt class="docutils literal"><span class="pre">str(&lt;string&gt;)</span></tt> where <cite>&lt;string&gt;</cite> is any string supported by Python.</li> </ul> <p>A sample program that computes a factorial can be found on github with the rest of the code; the syntax should become clearer by looking at it. However, be careful when using <cite>int()</cite> and <cite>str()</cite>: whatever is inside the parentheses is going to be evaluated by the Python interpreter. Furthermore, <cite>int()</cite> and <cite>str()</cite> cannot take an argument that contains whitespace. 
For example, instead of <tt class="docutils literal"><span class="pre">str(&quot;hello</span> world&quot;)</tt> you should use <tt class="docutils literal"><span class="pre">str(&quot;hello\x20world&quot;)</span></tt>.</p> <p>First let's look how one can extend the instruction set by modifying <tt class="docutils literal">myis.py</tt>.</p> <div class="highlight"><pre><span class="n">registers</span> <span class="o">=</span> <span class="p">[</span><span class="s">&quot;eax&quot;</span><span class="p">,</span> <span class="s">&quot;ebx&quot;</span><span class="p">,</span> <span class="s">&quot;ecx&quot;</span><span class="p">,</span> <span class="s">&quot;edx&quot;</span><span class="p">,</span> <span class="s">&quot;esp&quot;</span><span class="p">,</span> <span class="s">&quot;ebp&quot;</span><span class="p">,</span> <span class="s">&quot;esi&quot;</span><span class="p">,</span> <span class="s">&quot;edi&quot;</span><span class="p">]</span> <span class="n">instructions</span> <span class="o">=</span> <span class="p">{</span> <span class="s">&quot;push&quot;</span> <span class="p">:</span> <span class="p">[{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x00</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;cc&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span class="s">&quot;reg&quot;</span><span class="p">]}],</span> <span class="s">&quot;pop&quot;</span> <span class="p">:</span> <span class="p">[{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x01</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;cc&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span 
class="s">&quot;reg&quot;</span><span class="p">]}],</span> <span class="s">&quot;mov&quot;</span> <span class="p">:</span> <span class="p">[{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x02</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;ccI&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span class="s">&quot;reg&quot;</span><span class="p">,</span> <span class="s">&quot;imm&quot;</span><span class="p">]},</span> <span class="p">{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x03</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;ccc&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span class="s">&quot;reg&quot;</span><span class="p">,</span> <span class="s">&quot;reg&quot;</span><span class="p">]},</span> <span class="p">{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x04</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;ccc&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span class="s">&quot;reg&quot;</span><span class="p">,</span> <span class="s">&quot;@reg&quot;</span><span class="p">]},</span> <span class="p">{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x05</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;ccc&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span 
class="s">&quot;@reg&quot;</span><span class="p">,</span> <span class="s">&quot;reg&quot;</span><span class="p">]},</span> <span class="p">{</span><span class="s">&quot;opcode&quot;</span> <span class="p">:</span> <span class="mh">0x06</span><span class="p">,</span> <span class="s">&quot;format&quot;</span> <span class="p">:</span> <span class="s">&quot;&lt;ccI&quot;</span><span class="p">,</span> <span class="s">&quot;params&quot;</span> <span class="p">:</span> <span class="p">[</span><span class="s">&quot;reg&quot;</span><span class="p">,</span> <span class="s">&quot;ref&quot;</span><span class="p">]}</span> <span class="p">],</span> <span class="o">...</span> <span class="p">}</span> </pre></div> <p>Above you can see an excerpt from myis.py, which contains two variables: registers and instructions. The registers variable is a list containing all valid registers that the tokenizer will recognize. If the order of the registers is changed the compiler will generate different bytecode, because it refers to registers by their position in the list. The instructions variable is a dictionary in which each key is the name of an instruction and maps to a list that contains one or more dictionaries. Each inner dictionary has three keys:</p> <ul class="simple"> <li><cite>opcode</cite> -- an opcode number.</li> <li><cite>format</cite> -- an instruction encoding format. See the Python struct module.</li> <li><cite>params</cite> -- a list of parameters (in exact order) the instruction takes. Valid values are: <cite>reg</cite>, <cite>&#64;reg</cite>, <cite>ref</cite>, <cite>imm</cite>, <cite>str</cite>. See tokens above for more details.</li> </ul> <p>Sometimes an instruction can have multiple dictionaries, as mov does in the above example. This should only be the case when the instruction can take different sets of arguments; the compiler will then iterate through them until it finds the right match. There is also a special instruction defined in myis.py: emit. 
emit takes a string as an argument and emits its bytes into the code. This can be useful if you want to make code overlap, reserve space for variables, or just manually inject bytes into the code. When references are used as arguments, the address of the reference is translated to an offset relative to the instruction pointer. Simply by modifying myis.py you can introduce new instructions to the compiler; then write the code and compile it with vmc.py.</p> <p>Once you have the bytecode, the final step is to write a vm that will read it and execute the encoded instructions. Among the source files I have included a very basic sample vm that will execute the factorial program. The implementation is pretty straightforward.</p> <p>Hopefully I have covered everything and didn't miss any parts of the compiler's logic. The source code I provide is highly experimental and probably has tons of bugs; nevertheless, it should give a good starting point to those who would like to write their own compiler and vm.</p>
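<p>To make the role of the <cite>format</cite> and <cite>params</cite> keys more concrete, here is a minimal sketch of how one variant could be selected and packed with the struct module. It is written for Python 3 and is not the code from compiler.py: the <cite>encode</cite> and <cite>kind</cite> helpers are hypothetical, only two mov variants are copied from the excerpt, and the byte layout (one opcode byte, one byte per register index, a 32-bit little-endian immediate) is an assumption read off the format strings.</p>

```python
import struct

# Copied from the excerpt above; only two mov variants are included.
registers = ["eax", "ebx", "ecx", "edx", "esp", "ebp", "esi", "edi"]
instructions = {
    "mov": [
        {"opcode": 0x02, "format": "<ccI", "params": ["reg", "imm"]},
        {"opcode": 0x03, "format": "<ccc", "params": ["reg", "reg"]},
    ],
}


def kind(operand):
    # Assumption: an int operand is an immediate, a known name is a register.
    if isinstance(operand, int):
        return "imm"
    if operand in registers:
        return "reg"
    raise ValueError("unsupported operand: %r" % (operand,))


def encode(name, *operands):
    """Pick the variant whose params match the operands and pack it."""
    kinds = [kind(op) for op in operands]
    for variant in instructions[name]:
        if variant["params"] == kinds:
            # The "c" format code takes a single byte per field.
            fields = [bytes([variant["opcode"]])]
            for op, k in zip(operands, kinds):
                # Registers are encoded as their index in the registers list.
                fields.append(op if k == "imm" else bytes([registers.index(op)]))
            return struct.pack(variant["format"], *fields)
    raise ValueError("no variant of %s takes %r" % (name, kinds))
```

<p>With these tables, <cite>encode("mov", "eax", 5)</cite> picks opcode 0x02 and <cite>encode("mov", "ebx", "ecx")</cite> picks opcode 0x03; reordering the <cite>registers</cite> list changes the register bytes in the output, which is why the order matters.</p>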