Rust for closed-source projects

I’ve been playing with Rust for a while now. With a few thousands lines of Rust code under my belt I can’t by any means claim to be an expert, but I’ve formed a few opinions on it. I mostly like its syntax very much. It’s refreshing compared to many other languages out there. I also consider Rust a complex language. While I can easily develop code in Java, which is what I’m currently doing, without having ever written a line of code in it, Rust is different. Just like with your first book on C or C++, you have to actually learn Rust.

So why did I start looking into Rust? Well, in the beginning it was just curiosity and wanting to learn something new and I had to choose between Go and Rust. I first looked into Go, took the official language tour and had understood its syntax in about 1-2 hours. That, of course, doesn’t mean I mastered it, but I was ready to program something in it. And that’s also the key strength of Go: you can learn programming in it right away, it’s extremely simple.

Then I started studying Rust and after half a day spent on a train reading parts of the Rust book, I ported a small CHIP-8 emulator to it in one day. Just a small exercise to become more familiar with the language. Throughout the following month I continued reading the Rust book and even re-reading chapters, in order not to forget things. It is necessary to use a language in order to remember it: this is true both for programming languages as for spoken ones. And the more complex a language is, the easier it is to forget it and Rust has a learning curve which is steeper than Go or C#.

I would say that a C++ developer can easily program in any C-dialect (Java, C#, Go, JS, etc.) and can easily learn languages such as Python or Ruby. Rust is a paradigm shift and takes longer to learn.

About two months ago, I had to start the development of a command-line project and needed a decent standard library. The end-result had to be a native, statically linked executable. Realistically, the possibilities were: C/C++, Rust or Go. For C++ the STL was out of question, because I think it’s terrible and anyway lacks many features which are essential. Qt is a beautiful C++ library and I use it gladly whenever I can, but static linking would require a commercial Qt license. On top of that, compiling on multiple OSs would require a rebuild of the static Qt library on each OS, so a lot of extra work.

But apart from all this, there was also the fact that the project in question was boring to develop and I wanted it to be fun. So while I briefly considered Go, I went almost immediately with Rust. Go is not just simple, it’s simplified. It lacks many important constructs present in other languages for the purpose of simplicity. While the simplicity of Go can be refreshing and has its charm, I found myself naturally gravitating towards the complexity of Rust.

To be clear, I don’t like unnecessary complexity and that’s why I am not using most of the new features to be found in C++. Complexity has to be kept simple. Rust set a number of goals for itself and some of them are complex to solve without a garbage collector. Within the complexity which arises from these goals, it needs to keep things as simple as possible.

Having said all that, Rust is not yet a mature language in many ways and can’t be used for just any project like C/C++. Its library is less rich than that of Go, some its standard library does have, in my opinion, an odd syntax and compiling can be really slow.

What I wrote in the CHIP-8 post was:

I can’t yet write something exhaustive about Rust, because I’m still learning it. What I can say up until now is that, apart some minor things which I dislike (snake-case: ugh), it seems fun to program in it. The amount of rules make the programming a bit more challenging, but it pays off in satisfaction once everything builds without complaints.

The only thing I can say is that I can’t clearly see a use-case for Rust. Yes, it’s a solid and secure language which is fun to use. But will it be used in the real world? I can see many use-cases for Go, not so many for Rust. What I hope is for Rust to mature some more and then to become stable, without going down the path to insanity like modern C++.

I must say I changed my mind. I definitely see a future for Rust, because if there are enough talented programmers who think it’s fun to program in it, it will grow. That’s a safe bet. The only thing it must avoid is to have people implementing useless features in it, the way it is being done in C++, just for their academic score. But before that happens Rust will definitely flourish.

One of the aspects about Rust in connection to closed-source projects which needs to be mentioned is that there’s a lot of debug information inside of a Rust executable, even in release mode. Every panic! in Rust prints out a lot of metadata.

Let’s take for instance this small sample I created:

It will print out:

By setting RUST_BACKTRACE to 1, it’s even worse:

I found these privacy issues related to closed-source projects being mentioned in this thread on GitHub, but I didn’t find any ready-to-use solution.

So the obvious and only solution is to modify the Rust compiler and that’s exactly what we’re going to do. While I’m describing how to do this on Windows, the parts not related to the build process are valid on Unix as well.

I’d like to mention that, in order to avoid this hassle, I briefly looked into Go to check how much metadata was to be found in Go binaries. The answer is: a lot. And it’s even way worse than in Rust, because Go has reflection and patching that out of the compiler is way more difficult and may break lots of stuff.

Another reason worth mentioning why Go was a no-go is that at least on Windows the capability of Go to call C-code via CGo requires the mingw compiler and that leads to a whole new set of problems.

The first step to build the Rust compiler is to download the source. You can do so either from the website or from GitHub. Then you need Visual Studio 2017 or above. The README says 2013 or above but I found another part of the documentation mentioning 2017 or above and I had difficulties building with Visual Studio 2013. The community edition of Visual Studio is more than enough. I used Visual Studio 2017.

Extremely important, however, are the packages of Visual Studio you need to install. Initially, since I didn’t think I needed many, I limited myself to the essential and got strange build errors. It took me a _lot_ of time to understand that I needed to install certain additional packages in Visual Studio. If you’re the kind of guy who just installs everything which comes with Visual Studio, then you’re good to go. But if you’re more like me and want to limit the installation size, here’s the essential packages you absolutely need to install in order not to anger the gods:

Build instructions can be found in the README. The relevant part for us is:

The build triple I used is ‘i686-pc-windows-msvc’, because I needed the application to be 32-bit, in order to maximize compatibility. What I did is to copy the ‘config.toml.example’ in the main directory to ‘config.toml’ and modify parts of it.

The following are the parts I modified:

rustc is a bootstrapping compiler, which means that it uses itself to build itself. There are 3 build stages called stage0, stage1 and stage2. Only at stage1 the sources in our directory are used. The resulting compiler then builds itself again in stage2. This process is described in detail on this page.

Finding the panic! macro is very easy: it’s inside src/libcore/macros.rs.

While the first instinct would be to patch this macro, if we look at what is called inside of it, we can see it calls the macros file!, line! and __rust_unstable_column!. These macros are defined in the same file:

Unfortunately, they are built-in. However, patching out these macros is much better than modifying the panic! macro, as it solves the issue at its roots and prevents these macros from generating metadata elsewhere.

So I searched for the “column” word in the whole source tree and after a bit of inspection finally got to the location where built-in macros are expanded, which is in src/libsyntax/ext/source_util.rs.

So I patched out the relevant parts:

After these changes we can open the Visual Studio command prompt and compile by entering:

The compile process will take a while. If you have many cores, you can try to speed it up by changing relevant parts in the config.toml file. It can also happen that the build ends with some strange error. This may happen if you’re compiling for 32-bit and LLVM exhausts memory. The documentation mentions this. It’s not a big issue, just relaunch the build command and the build process will continue from where it left off. It never happened to me that I had to rebuild more than once.

If the build ends successfully, you should end up with a rustc compiler in build/i686-pc-windows-msvc/stage2/bin. I didn’t find any cargo.exe in that directory, so I just copied the one from the official installation into it.

I then prepared a batch file to launch the Visual Studio command prompt for the correct Rust version:

And compiled the release of the test binary via:

The output now is:

If we set RUST_BACKTRACE, the result will be the same.

After inspecting the executable we can see that there is still some metadata left in the shape of some absolute paths, such as:

All the paths I could find were related to the path of the compiler and not that of the project. If you’re bothered by them, it’s easy to write a simple Python script to zero them out as a post-build step.

Now we could be ready, save for the fact that the libc wasn’t linked statically into our executable. If we take a look at the import table, we can see the ugly imports produced by newer versions of Visual Studio.

To solve this we need to invoke rustc like this:

I found the relevant documentation for this here. But we want to specify this flag for cargo. We can achieve this by setting the environment variable RUSTFLAGS:

So I modified my batch script like so:

Now after the build process, we end up with a bigger executable and no external dependencies apart from kernel32. Perfect!

At this point we only have to strip the debug directory from the PE. We can do this by using a simple script for Cerbero Suite or CFF Explorer. To be honest for my programs I still use a CFF Explorer script and never bothered writing one for Cerbero.

You can call this script fixdbg.cff and launch it directly as the cff extension is associated to CFF Explorer. This can be arranged as a post-build step.

Let’s finish this nicely by maximizing compatibility. Now that we have a clean, statically-linked executable, we can try to make it run on XP. We just need to modify some fields in the Optional Header of the Portable Executable.

We modify these fields as follows:

MajorOperatingSystemVersion: 5
MinorOperatingSystemVersion: 0
MajorSubsystemVersion: 5
MinorSubsystemVersion: 0

And now it’s time to try…

We have a stripped Rust executable built with the latest stable Rust compiler and Visual Studio 2017 running on Windows XP!

Porting a CHIP-8 emulator to Rust

I’ve been meaning to learn the Rust language for quite some years and found only now the time to start this endeavor. I must say it has probably been for the best, as the language has clearly matured a lot since the last time I looked into it.

As a first project to try out Rust I ported Laurence Muller’s CHIP-8 emulator to it. It’s a simple C++ project and it took me only a day to port it to Rust.

You can download my port from GitHub.

There’s not much to write about the project itself apart that the original code used GLUT and the port uses SDL2. I also implemented basic audio support, but didn’t work on providing a realistic clock speed.

I can’t yet write something exhaustive about Rust, because I’m still learning it. What I can say up until now is that, apart some minor things which I dislike (snake-case: ugh), it seems fun to program in it. The amount of rules make the programming a bit more challenging, but it pays off in satisfaction once everything builds without complaints.

The only thing I can say is that I can’t clearly see a use-case for Rust. Yes, it’s a solid and secure language which is fun to use. But will it be used in the real world? I can see many use-cases for Go, not so many for Rust. What I hope is for Rust to mature some more and then to become stable, without going down the path to insanity like modern C++.

Batch image manipulation using Python and GIMP

Not a very common topic for me, but I thought it could be neat to mention some tips & tricks. I won’t go into the details of the Python GIMP SDK, most of it can be figured out from the GIMP documentation. I spent a total of one hour researching this topic, so I’m not an expert and I could have made mistakes, but perhaps I can save some effort to others which want to achieve the same results. You can jump to the end of the tutorial to find a nice skeleton batch script if you’re not interested in reading the theory.

To those wondering why GIMP, it’s because I created a new icon for Profiler and wanted to automatize some operations on it in order to have it in all sizes and flavors I need. One of the produced images had to be semi-transparent. So I thought, why not using a GIMP batch command, since anyway GIMP is installed on most Linux systems by default?

Just to mention, GIMP supports also a Lisp syntax to write scripts, but it caused my eyes to bleed profusely, so I didn’t even take into it consideration and focused directly on Python.

Of course, I could’ve tried other solutions like PIL (Python Imaging Library) which I have used in the past. But GIMP is actually nice, you can do many complex UI operations from code and you also have an interactive Python shell to test your code live on an image.

For example, open an image in GIMP, then open the Python console from Filters -> Python-Fu -> Console and execute the following code:

And you’ll see that the image is now halfway transparent. What the code does is to take the first image from the list of open images and sets the opacity of the first layer to 50%.

This is the nice thing about GIMP scripting: it lets you manipulate layers just like in the UI. This allows for very powerful scripting capabilities.

The first small issue I’ve encountered in my attempt to write a batch script, is that GIMP only accepts Python code as command line argument, not the path to a script on disk. According to the official documentation:

GIMP Python All this means that you could easily invoke a GIMP Python plug-in such as the one above directly from your shell using the (plug-in-script- fu-eval …) evaluator:

gimp –no-interface –batch ‘(python-fu-console-echo RUN-NONINTERACTIVE “another string” 777 3.1416 (list 1 0 0))’ ‘(gimp-quit 1)’

The idea behind it is that you create a GIMP plugin script, put it in the GIMP plugin directory, register methods like in the following small example script:

And then invoke the registered method from the command line as explained above.

I noticed many threads on stackoverflow.com where people were trying to figure out how to execute a batch script from the command line. Now, the obvious solution which came to my mind is to execute Python code from the command line which prepends the current path to the sys.path and then to import the batch script. So I searched and found that solution suggested by the user xenoid in this stackoverflow thread.

So the final code for my case would be:

And:

What took me most to understand was to call the method merge_visible_layers before saving the image. Initially, I was trying to do it without calling it and the saved image was not transparent at all. So I thought the opacity was not correctly set and tried to do it with other methods like calling gimp_layer_set_opacity, but without success.

I then tried in the console and noticed that the opacity is actually set correctly, but that that information is lost when saving the image to disk. I then found the image method flatten and noticed that the transparency was retained, but unfortunately the saved PNG background was now white and no longer transparent. So I figured that there had to be a method to obtain a similar result but without losing the transparent background. Looking a bit among the methods in the SDK I found merge_visible_layers. I think it’s important to point this out, in case you experience the same issue and can’t find a working solution just like it happened to me.

Now we have a working solution, but let’s create a more elegant one, which allows use to use GIMP from within the same script, without any external invocation.

We can now call our function simply like this:

Which looks very pretty to me.

I could go on showing other nice examples of image manipulation, but the gist of the tutorial was just this. However, GIMP has a rich SDK which allows to automatize very complex operations.

Ctor conflicts

Perhaps the content of this post is trivial and widely known(?), but I just spent some time fixing a bug related to the following C++ behavior.

Let’s take a look at this code snippet:

The output of the code above is:

Whether we compile it with VC++ or g++, the result is the same.

The problem is that although the struct or class is declared locally the name of the constructor is considered a global symbol. So while the allocation size of the struct or class is correct, the constructor being invoked is always the first one encountered by the compiler, which in this case is the one which prints ‘apple’.

The problem here is that the compiler doesn’t warn the user in any way that the wrong constructor is being called and in a large project with hundreds of files it may very well be that two constructors collide.

Since namespaces are part of the name of the symbol, the code above can be fixed by adding a namespace:

Now the correct constructor will be called.

I wrote a small (dumb) Python script to detect possible ctor conflicts. It just looks for struct or class declarations and reports duplicate symbol names. It’s far from perfect.

In my opinion this could be handled better on the compiler side, at least by giving a warning.

ADDENDUM: Myria ‏(@Myriachan) explained the compiler internals on this one on twitter:

I’m just surprised that it doesn’t cause a “duplicate symbol” linker error. Symbol flagged “weak” from being inline, maybe? […] Member functions defined inside classes like that are automatically “inline” by C++ standard. […] The “inline” keyword has two meanings: hint to compiler that inlining machine code may be wise, and making symbol weak. […] Regardless of whether the compiler chooses to inline machine code within calling functions, the weak symbol part still applies. […] It is as if all inline functions (including functions defined inside classes) have __declspec(selectany) on them, in MSVC terms. […] Without this behavior, if you ever had a class in a header with functions defined, the compiler would either have to always inline the machine code, or you’d have to use #ifdef nonsense to avoid more than one .cpp defining the function.

The explanation is the correct one. And yes, if we define the ctor outside of the class the compiler does generate an error.

The logic mismatch here is that local structures in C do exist, local ctors in C++ don’t. So, the correct struct is allocated but the wrong ctor is being called. Also, while the symbol is weak for the reasons explained by Myria, the compiler could still give an error if the ctor code doesn’t match across files.

So the rule here could be: if you have local classes, avoid defining the ctor inside the class. If you already have a conflict as I did and don’t want to change the code, you can fix it with a namespace as shown above.

Software Theft FAIL

… Or why stealing software is stupid (and wrong). A small guide to detect software theft for those who are not reverse engineers.

Under my previous post the user Xylitol reported a web-page (hxyp://martik-scorp.blogspot.com/2010/12/show-me-loaded-drivers.html) by someone called “Martik Panosian” claiming that my driver list utility was his own.

Now, the utility is very small and anybody who can write a bit of code can write a similar one in an hour. Still, stealing is not nice. 🙂

Since I can’t let this ignominious theft go unpunished :P, I’ll try at least to make this post stretch beyond the specific case and show to people who don’t know much about these sort things how they can easily recognize if a software of theirs has been stolen.

In this specific case, the stolen software has been changed in its basic appearance (title, icon, version information). It can easily be explored with a software such as the CFF Explorer. In this case the CFF Explorer also identifies the stolen software as packed with PE Compact. If the CFF Explorer fails to recognize the signature, it’s a good idea to use a more up-to-date identification program like PEiD.

However, packing an application to conceal its code is a very dumb idea. Why? Because packers are not meant to really conceal the code, but to bind themselves to the application. What is usually difficult to recover in a packed application is its entry-point, the IAT and other things. But the great majority of the code is usually recoverable through a simple memory dump.
Just select the running application with an utility such as Task Explorer, right click to display the context menu and click on “Dump PE”.

Now the code can be compared. There are many ways to compare the code of two binaries. One of the easiest is to open it with IDA Pro and to use a binary diffing utility such as PatchDiff2. If the reader is doing this for hobby and can’t afford a commercial license of IDA Pro, then the freeware version will do as well.

Just disassemble both files with IDA Pro and save one of the idbs. Then click on “Edit->Plugins->PatchDiff2” and select the saved idb.

Let’s look at a screenshot of the results:

Click to enlarge

As it is possible to see, not only were the great majority of functions matched, but they also match at the same address, which proves beyond doubt that they are, in fact, the same application.

It’s important to remember that a limited number of matches is normal, because library functions or some basic ones may match among different applications.

A comparison of two applications can even be performed manually with IDA Pro, just by looking at the code, but using a diffing utility is in most cases the easiest solution.