
26 July 2016

Automated builds for C++ projects via GitLab CI

New address: MKlimenko.github.io

First of all: GitLab is an open-source git service. About a year ago, when I asked for any kind of source control, the admins gave me access to our git server. For a long time I used only a few of its features, just straightforward git pushes to save my work. However, things were about to change very soon. Currently I'm working on a rather complicated project which involves multiple sub-projects (DSP, ARM, API, UI) and requires a lot of SDKs and IDEs to build. And when we were forced to wait three hours to install Qt (hello, corporate proxies), I decided it would be great to set up a dedicated build server that would pull every new commit from git and handle all the builds. A brief search pointed me to Jenkins, but I was too lazy to set it up. Later that day I visited our corporate git webpage and one menu entry caught my attention: “Builds”. Simply put, it's an integrated build tasker which engages whenever a new commit is pushed.

To engage the CI, you only have to put a .gitlab-ci.yml file in the root of the repository. There are many possibilities, like performing job B only if job A fails/succeeds, etc. I'm not that good at it yet, so I just update all of the submodules prior to the builds (to acquire the latest libraries) and then perform the builds. If a build job fails, the last pusher is notified that they broke the build. When I was testing this feature, I was receiving around 50 emails an hour.


There are many examples of .gitlab-ci.yml files for popular languages, but I was unable to find one for Visual Studio and Qt project builds. Maybe these will help you:
MSBuild:

 Job_name:  
  script:  
  - 'setlocal'  
  - 'chcp 65001'  
  - 'call "%VS120COMNTOOLS%..\..\vc\vcvarsall.bat" x86_amd64'  
  - 'msbuild.exe make\vs120\Project_name.sln /t:Rebuild /p:Configuration=Release /p:Platform="x64" /m'  
  - 'if not exist "%BUILDS%\Project_name" (mkdir "%BUILDS%\Project_name")'  
  - 'copy make\vs120\x64\Release\Project_name.exe "%BUILDS%\Project_name"'  

The “Job_name:” string is required, and this is how your build/test job will be displayed on the server's web page.

The “- 'chcp 65001'” line is required to correctly display the Cyrillic characters in the MSBuild output.

The “- 'call "%VS120COMNTOOLS%..\..\vc\vcvarsall.bat" x86_amd64'” line sets the required environment variables, helping the shell executor to find MSBuild, the C++ compiler, etc.

The last two lines store the latest build in a shared folder on the server. That helps a lot when it comes to sharing some of the software I develop.

Easy, huh? Let’s switch to something more interesting, Qt projects!

Another_Job_name:  
  script:  
  - 'setlocal'  
  - 'chcp 65001'  
  - 'call "%VS120COMNTOOLS%..\..\vc\vcvarsall.bat" x86_amd64'  
  - 'cd make\qt5'  
  - 'call "%QT_ROOT_x86_64%\bin\qmake.exe" Qt_project.pro -r -spec win32-msvc2013'  
  - 'call "%QT_CREATOR%\bin\jom.exe" -f Makefile.Release'   
  - 'rd /s/q deploy'  
  - 'mkdir deploy'  
  - 'copy release\Qt_project.exe deploy'  
  - 'set curr_dir=%cd%'  
  - 'cd /d "%QT_ROOT_x86_64%\bin"'  
  - 'windeployqt.exe "%curr_dir%\deploy\Qt_project.exe" -no-translations'  
  - 'cd /d %curr_dir%'  
  - 'if not exist "%BUILDS%\Qt_project" (mkdir "%BUILDS%\Qt_project")'  
  - 'xcopy /s /y deploy "%BUILDS%\Qt_project"'  

Ah, now that's interesting. Several commands are the same as before; then we call qmake to create Makefiles from the .pro file, and jom (a multithreaded replacement for nmake) builds the Release version.

A Qt application requires a lot of libraries to run, so we need them all to be present in the final folder. There's a tool called windeployqt which analyzes the executable and puts everything right next to it. For some reason I wasn't able to make it work without changing the current directory, but who cares. xcopy then copies everything from the local deploy folder to the shared builds folder.


It took me a while to figure out how to call qmake, jom and windeployqt, but again, nothing too difficult. So I present to you the ARM project build script:

 ARM_project_job:  
  script:  
  - 'setlocal'  
  - 'chcp 65001'  
  - 'set command=""%DS-5_DIR%\sw\eclipse\eclipsec.exe" -nosplash --launcher.suppressErrors -application org.eclipse.cdt.managedbuilder.core.headlessbuild -data "%ARM_WORKSPACE%" -import make\eclipse\arm_project -cleanBuild arm_project"'   
  - 'echo "%command%" | "%DS-5_DIR%\bin\cmdsuite.exe" 2> error.txt'  
  - 'for %%A in (error.txt) do set fileSize=%%~zA'  
  - 'del /f /q error.txt'   
  - 'if not %fileSize%==0 (exit /b 1)'  
  - 'if not exist "%BUILDS%\arm_project" (mkdir "%BUILDS%\arm_project")'  
  - 'copy make\eclipse\arm_project\Release\arm_project.axf "%BUILDS%\arm_project"'  

The interesting part here is piping the %command% to cmdsuite.exe. Cmdsuite is the DS-5 command prompt, a batch job with some internal magic about licensing and configuring databases. I call it magic because I failed at exporting all of its environment variables to the system. The problem is how to pass the command to the batch job, and somehow piping works. The command itself is just a call to Eclipse without the logo, ordering it to import the given project and build it in headless mode. Attention! If a project with the same name is already imported into your DS-5 workspace on that PC, Eclipse won't be able to import it and, therefore, to build it. Then I redirect the output of the build to a text file and read it later in the main thread (command prompt). This is required because the batch job output is suppressed and isn't available for GitLab CI to analyze. If the file is not empty, I presume that there are errors and exit with error code 1, which marks the build as failed. Otherwise, I have the latest ARM executable in the shared folder.

8 July 2016

Managing threads in Qt

I was writing a little post about performing a remote power reset, but it's not quite there yet, so I'll share with you some ideas about multithreading in UI applications. Let's say we're developing a simple client-server application that has to call a callback function every time new data is available. Therefore, we need to constantly listen on the port for incoming data, which is impossible in the main thread. Here comes the multithreading: we have to spawn a listener thread, while our main thread waits for instructions.

There are two ways to create a parallel application in C++: thread-based and task-based, i.e. std::thread vs std::async. I personally prefer the std::thread way, simply because std::async needs the std::launch::async policy specified to guarantee truly asynchronous execution.
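Here's a minimal sketch of the difference, with a hypothetical blocking listen_port() function standing in for the real port reading:

 #include <future>
 #include <thread>

 int listen_port() { return 42; } //hypothetical blocking read, stands in for the real I/O

 int main() {
      //Thread-based: starts immediately, must be joined (or detached)
      std::thread worker(listen_port);
      worker.join();

      //Task-based: without std::launch::async the call may be deferred
      //and executed lazily only when fut.get() is called
      auto fut = std::async(std::launch::async, listen_port);
      int result = fut.get();
      (void)result;
 }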

Now, let's wrap the reader thread into a class:

#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

class Foo {
private:
    std::thread RxThread;
    uint8_t data[1024] = {};   //placeholder receive buffer
    size_t size_of_data = 0;   //placeholder payload size
    //…
    void InfiniteRead(std::function<void(uint8_t*, size_t &)> callback) {
        for (;;) {
            //Read some data into `data` and set `size_of_data`
            bool read = false; //set to true when a packet has arrived
            if (read) {
                callback(data, size_of_data);
            }
        }
    }
    //…
public:
    void Init(std::function<void(uint8_t*, size_t &)> callback) {
        RxThread = std::thread(&Foo::InfiniteRead, this, callback);
    }
    ~Foo() {
        if (RxThread.joinable())
            RxThread.detach(); //or signal the loop to stop and join()
    }
};

Voila! We've spawned a thread which constantly reads data and calls some function on the received data. Problems start when we need to update the UI according to that data. It is a bad habit to change anything in the UI from other threads, because you never know if it's being updated from the main thread at the same moment.

In Qt, one should use events to update the UI from (sort of) another thread. First, declare a class that will carry the event:

#include <QEvent>

class MyEvent : public QEvent {
public:
    struct event_msg {
        //some custom struct with data
    };

    MyEvent(const event_msg& message) : QEvent(QEvent::User), _message(message) {}
    ~MyEvent() {}

    event_msg message() const { return _message; }

private:
    event_msg _message;
};

Then you declare a simple method in the UI header file:

bool event(QEvent* event);

With the following implementation:

bool UI_Class::event(QEvent* event){
    if (event->type() == QEvent::User){

        MyEvent* postedEvent = static_cast<MyEvent*>(event);
        //some code
    }
    return QWidget::event(event);
}

In the overridden event method, we check the event type, and if it is the one we’ve made, we execute some commands. Otherwise, we send the event further down the food chain. And the last thing is how to send such events:

    void blabla(){
        ...
        MyEvent::event_msg event;
        //fill event_msg with data
        MyEvent* e = new MyEvent(event);
        //parent points to the widget whose event() method we've overridden
        QCoreApplication::postEvent(parent, e);
    }


Done. As simple as that. Also, it's a very good idea to notify the thread via a std::condition_variable that you've received some data.
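A minimal sketch of that kind of notification (the names here are placeholders, not part of the actual application):

 #include <condition_variable>
 #include <mutex>
 #include <thread>

 std::mutex m;
 std::condition_variable cv;
 bool data_ready = false;

 void consumer() {
      std::unique_lock<std::mutex> lock(m);
      cv.wait(lock, [] { return data_ready; }); //sleeps until notified and data_ready is true
      //process the received data here
 }

 void producer() {
      {
           std::lock_guard<std::mutex> lock(m);
           data_ready = true; //new data has arrived
      }
      cv.notify_one();
 }

 int main() {
      std::thread t(consumer);
      producer();
      t.join();
 }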

27 May 2016

Digital Signal Visualization

We live in the digital era. Even if you have no idea what the difference between RAM and a CPU is, or between GSM and GPS, you have to accept the fact that the vast majority of the things around us have something digital inside. It may be a simple RFID chip on a pack of milk or a public transport ticket, or a complicated device hidden in a beautiful enclosure.

Everything is great from a consumer's point of view, because devices are getting smarter, smaller and cheaper. Along with the technology, new ideas appear, simplifying our daily routine. We, the engineers, convert those ideas into real devices.

Where there are real devices, there are real signals which need to be viewed, analysed, compared, etc. There are several libraries made for this purpose (gnuplot, for example), but they're too complicated for me and overloaded with possibilities. At my work we have an internal library for plotting one-dimensional signals, and it's one hell of a piece of software. It does just what you want from it and a bit more. It's great at serving one exact purpose: plotting the signal.

Once, at an interview, I was asked about my experience in analysing a signal in its binary form, and I was genuinely confused by the question: why would I want to analyse it that way when it's much easier to plot and view it?

And that's why I decided to make an application based on our internal library. It does one thing: it displays the file you've specified with the parameters you've provided. For example, here's the result of passing two files with different data structures (complex vs real, 16-bit vs 8-bit) to the program:


Casting to the required type of data is pretty simple: the program reads the file as an array of bytes (uint8_t, actually), and then the pointer to this array is passed to the library via a reinterpret_cast.
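Something along these lines (the file name and the plotting call at the end are hypothetical; only the read-and-cast part reflects the actual approach):

 #include <cstddef>
 #include <cstdint>
 #include <fstream>
 #include <iterator>
 #include <vector>

 int main() {
      //Read the whole file as raw bytes
      std::ifstream file("signal.bin", std::ios::binary);
      std::vector<uint8_t> raw((std::istreambuf_iterator<char>(file)),
           std::istreambuf_iterator<char>());

      //Interpret the same buffer as 16-bit samples before passing it to the plotting library
      auto samples = reinterpret_cast<const int16_t*>(raw.data());
      size_t sample_count = raw.size() / sizeof(int16_t);
      //plot(samples, sample_count); //hypothetical library call
      (void)samples; (void)sample_count;
 }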

One problem isn't solved yet: dealing with packed signals. GNSS signals require only 1 or 2 bits to be efficiently quantized, so why would one want to spend a whole byte on every sample? Several times I've seen signals which are packed beyond any reasonable point. One was a signal from a GPS/GLONASS L1+L2 receiver, and every single byte looked like this (separated into bits):

|GPS L1 Inphase|GPS L1 Quadrature|GLN L1 Inphase|GLN L1 Quadrature|GPS L2 Inphase|GPS L2 Quadrature|GLN L2 Inphase|GLN L2 Quadrature|

For now I can't figure out a convenient way for the user to tell the application which packing is used. There are some options, but they're pretty complicated and I don't want to use them. So for now, you have to unpack the signal before you visualize it, as in the sketch below.
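Once you do know the packing, unpacking it is trivial; here's a sketch for the byte layout above (the MSB-first bit order is my assumption):

 #include <array>
 #include <cstdint>

 //Unpack one byte of the layout above into eight 1-bit samples, ordered
 //GPS L1 I/Q, GLN L1 I/Q, GPS L2 I/Q, GLN L2 I/Q
 std::array<int8_t, 8> UnpackByte(uint8_t byte) {
      std::array<int8_t, 8> samples{};
      for (int bit = 0; bit < 8; ++bit) {
           //Map bit values {0, 1} to signal levels {-1, +1}
           samples[bit] = ((byte >> (7 - bit)) & 1) ? 1 : -1;
      }
      return samples;
 }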

Anyway, if you've managed to read this far, you're either interested in this application or you're one of my friends whom I've bothered beyond any reasonable level. I'm not quite sure the corporate ethics allow me to give a direct link, but if you need it for non-commercial use, feel free to contact me via the e-mail address you may find on the right side of this blog.

24 April 2016

Application processing in GNSS

Most modern GNSS receivers share a similar architecture:
  1. Antenna & LNA;
  2. Front end;
  3. Baseband processing;
  4. Application processing.
In hardware it's usually implemented as a separate antenna unit, a front-end IC for down-conversion and primary filtering, ADCs and an ASIC. The ASIC consists of multiple (nowadays up to several hundred!) channels with correlators, heterodynes, NCOs and so on. It may or may not also contain a general-purpose processor, such as an ARM or a PowerPC. As far as I know, there are no solutions using x86 (due to license fees or the power requirements, who knows), but I'd love to create a receiver based on an Edison or something like that.

Application processing is a huge field in GNSS development, and it's used for things such as:
  1. Locked loop discriminators with feedback to the satellite channels. This is the heart of tracking;
  2. Calculating the PVT solution from the raw pseudoranges and pseudophases;
  3. Monitoring the GNSS signal integrity, and so on.
Lately I've been developing a tool for ARM which prepares the DSPs for operation and launches them. When you have to bring up bare-metal hardware, almost every time you need to read/write some registers, pull some GPIO pins and so on.

Let's have a look at an abstract SoC. Say we have 4 ADCs (GPS L1, GLN L1, GPS L2, GLN L2), and we want to start only two of them. We go to the manual and read that to enable ADC #1 and #3 we have to take a 32-bit register and set its first and third bits.

 uint32_t* start_adc_ptr = reinterpret_cast<uint32_t*>(0xfff88000);  
 start_adc_ptr[0] = 0xA;  

Or even worse:

 *reinterpret_cast<uint32_t*>(0xfff88000) = 0xA;  

Why is it bad? Because it's not clear why the hell I would want to write 0xA at some memory address (here I'd like to greet my good friend, a C# developer, who literally turns grey when I speak about such "unsafe" things).

Is there a way to improve it? Sure, one may add a nice comment, something like this:

 //ADC start control registers  
 uint32_t* start_adc_ptr = reinterpret_cast<uint32_t*>(0xfff88000);   
 //Start ADC 1 and 3: 0000_0000_0000_0000_0000_0000_0000_1010 = 0xA  
 start_adc_ptr[0] = 0xA;   

Well, ok, now it's clear and good to create, test and run like there's no tomorrow. This is fine when someone just has to support your project in five years' time, but it becomes more and more complicated if there's a need to modify or update the code: you have to add some more bits and somehow convert the mask to a hex value again.

The solution is the std::bitset container. It represents an integer number (or a std::string like "00001010") as an array of bits. So now, if we need to modify our code, it's easier to do it this way:

 #include <bitset>
 #include <cstdint>

 //ADC start control registers
 uint32_t* start_adc_ptr = reinterpret_cast<uint32_t*>(0xfff88000);
 //Start ADC 1 and 3: 0000_0000_0000_0000_0000_0000_0000_1010 = 0xA
 uint32_t old_value = 0xA;
 std::bitset<32> new_value(old_value);
 new_value[0] = 1;
 new_value[2] = 1;
 //new_value: Start ADC 0..3: 0000_1111
 start_adc_ptr[0] = static_cast<uint32_t>(new_value.to_ulong());

In the code above, new_value is initialised with the old value, and then two bits are set. And that's it. Also, this container greatly simplifies the bitwise programming questions at interviews: new_value.count() returns the number of bits set, and bitwise operations are simplified to the dumbest possible level.
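A couple of the things it gives you for free (a toy example, nothing receiver-specific here):

 #include <bitset>
 #include <cstdint>

 int main() {
      std::bitset<32> reg(0xA);     //0000_1010
      auto bits_set = reg.count();  //2
      bool third = reg.test(3);     //true: bit 3 is set in 0xA
      std::bitset<32> mask(0x3);    //0000_0011
      auto combined = reg | mask;   //0000_1011
      uint32_t value = static_cast<uint32_t>(combined.to_ulong()); //0xB
      (void)bits_set; (void)third; (void)value;
 }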

The more I work with C++, the more it amazes me. And not only the C++11/14 features (which are great, check out decltype(auto) functions), but also the older STL and Boost stuff.

5 April 2016

Vector signal generators

Long story short: modern RF devices are awesome. Especially if you've learned everything you know on the old valve generators and scopes, there's just a huge amount of possibilities.



For example, here are the features I personally find the most important and superior to conventional generators:
  1. Excellent stability;
  2. Remote control. It may be a simple set-up-and-go case or some scenario;
  3. As an extension to the previous point: such devices may be used to create an automated test platform. To explain this idea I have to describe the development cycle of new equipment.
Let's assume that we have a great and bug-free (which isn't always true) hardware platform, some SoC or ASIC, and we want to implement a brand-new algorithm. At first the developer should do some theoretical investigation to build a plausible model. Models are easy to debug and very important for estimating the performance and the qualitative characteristics. Then the model has to be modified step by step to approximate, or even simulate, the hardware platform.

When this stage is over, it's time to migrate the algorithm to the external hardware. And to test our brand-new algorithm we create a suitable environment, i.e. a series of tests, each of which requires a different signal from the generator. Remote control allows us to set up the test script, launch it to acquire the data, and have some time to play ping-pong, go for a coffee break, or write an article about how cool it is to be an engineer nowadays.

Today I found the only flaw in the generator I'm working with: one can't simply run the signal once without an external trigger. The signal may be set either to free run (an infinite number of repeats) or to a single run from a TTL trigger. The question is: how do I get this trigger (preferably from within Visual Studio)? Two hours of searching and testing stuff from our big box of junk with WinAPI and a multimeter gave me a solution: a USB-TTL stick! It's cheap, it looks to the OS like a virtual COM port, and it gives great trigger pulses for the generator.

Of course any additional hardware is a bad decision. There are some other approaches I've tried, but they're either not working so well or too hard to implement. 

A pretty obvious solution is to start the generator, wait for the signal to end (the sample count and the sampling frequency are known) and turn the generator (an arbitrary waveform generator, to be precise) off. It may be done with something like this:

 std::this_thread::sleep_for(std::chrono::microseconds(N));

std::chrono is a great tool and it's really precise, but the problems begin when we're dealing with short signals (up to 1 ms), because TCP/IP is far too slow at that scale and won't stop the generator in time. A workaround is to fill the space after the signal with a lot (A LOT) of zeroes, but that's extremely memory-inefficient. To be honest, though, I'd have picked that option if I hadn't come up with the USB-TTL stick idea.
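For longer signals the wait itself is easy to compute; a sketch with hypothetical parameter names:

 #include <chrono>
 #include <cstddef>
 #include <thread>

 //Wait for the uploaded waveform to finish playing, given its length in samples
 //and the sampling frequency (both are known in advance)
 void WaitForSignalEnd(std::size_t samples, double sampling_frequency_hz) {
      const auto duration_us = static_cast<long long>(samples / sampling_frequency_hz * 1e6);
      std::this_thread::sleep_for(std::chrono::microseconds(duration_us));
 }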

The other way is to generate two equal-length signal sections (the true signal and zeroes), upload them both to the generator and set up a sophisticated scenario that switches sections on the external trigger. Then put a marker (an outgoing trigger) at the end of the signal section and wire the trigger output to the trigger input. Ta-da, magic happens. When the signal is over, it triggers the zeroes section, which runs and repeats (up to 65k times, according to the specifications). Pretty cool, eh? But it's too much work. If the TTL stick proves itself inefficient, that's the way I'll try.

UPD: oops, this is what happens when you write an article too soon. I've found a way to control single-burst signals via SCPI. The reputation of vector signal generators is restored.

30 March 2016

Development for embedded systems

Disclaimer: in this case my meaning of "embedded systems" is a bit different from what one might expect. Nowadays the Internet of Things (IoT) is a rapidly growing market, and developers often mean modules like the esp8266 and the various --duino boards.
Today I'll talk more about GNSS receiver SoCs, which also have very low power requirements and relatively weak CPUs (ARM or PowerPC), but often come with a DSP, which comes in handy given very limited computational resources.

In a perfect world, the only difference one should see between development for the PC and for GNSS SoCs is the performance. But we're not there yet. You have to download an external toolchain and a new IDE, buy a JTAG probe (God, they're expensive) and so on. And there's more: because you've changed the compiler, you have to revise the code very carefully. No more C++11 (goodbye templates, auto, nullptr); I haven't yet worked with an SoC that has an implemented STL (no more std::vector, only static arrays). I really love Microsoft's idea with Windows 10, that there's one OS and one set of development tools for every device: laptop, smartphone, tablet and PC. But again, we're not in a perfect world. So don't trust the compiler toolkit. Check the obj file, double-check if you have some strange bug or you're not sure. The more specialized the compiler is, the more unknown bugs and undefined behaviour it contains. Moral: don't trust the compiler!

To run a bare-metal application, you place it at the zero-offset memory. Then the first non-maskable interrupt starts the application. Sounds easy, eh?

Today I learned that it's not. This method assumes that the first instruction is located exactly at 0x0. But once again: don't trust the compiler. It may rearrange the code, data and other sections by its own rules. And the worst thing is that it won't even tell you. So today I had to use some duct-tape code: an assembly function placed at the zero-offset memory. That function (pre_start) just calls start. Start consists of some initializations generated by the compiler. After start, main() is called and we can proceed with our bare-metal debugging.

The ARM application is currently only used to control two LEDs, and to turn them ON I have to write 0 to the GPIO pin... But that's another story, about our circuit design department.

27 March 2016

Beidou (Compass) NH-code

Beidou signal is pretty similar to the GPS C/A L1 signal [src].

It is CDMA, and the carrier is modulated via BPSK with the following binary sequences:

  1. A ranging code with a chip rate of 2.046 Mchips/s;
  2. A navigation message, which is modulated onto the carrier at 50 bps;
  3. An additional 20-bit-long Neuman-Hoffman code (NH-code) at a 1000 bps rate.
The carrier with the first two sequences is (chip rate aside) just like the GPS C/A L1 signal. To get the GLONASS L1 signal you just need to add a square wave with a 10 ms period.

So why add the NH-code? It is said to assist the navigation message symbol synchronization in the receiver, as well as to provide additional spectrum spreading.

Our goal is to remove the modulation caused by this code. When the signal is relatively strong, there is no problem: the receiver just saves 20 bits from the PLL output, compares them with the shifted bit mask, and, if they match, yay, we've found it!


      uint64_t FindShift(uint64_t NH_code, uint64_t input) {
           //NH_code and input are 20-bit values, e.g.:
           //uint64_t NH_code = 0x72B20;
           //uint64_t input = 0xB208D;
           for (uint64_t shift = 0; shift < 20; ++shift) {
                //Generate the shifted code (one way: a cyclic shift of the 20-bit mask)
                uint64_t shifted_code = ((NH_code << shift) | (NH_code >> (20 - shift))) & 0xFFFFF;
                if (shifted_code == input) {
                     return shift;
                }
           }
           return 20; //no shift matched
      }

Looks pretty great and obvious, doesn't it? It's fast (I mean really fast), it has very low memory consumption and, most importantly, it's simple. Easy to understand, easy to support.

But it doesn't cover two very important cases:
  1. Change of the sign of the navigational message bit;
  2. PLL errors due to the low SNR.
If we want to stick with the bitwise algorithm, these cases will give us some serious headache. Taking the possibility of the sign change into account immediately makes the algorithm more complicated, because we have to compare the input not with one mask, but with four for every shift!

And the second case makes it even worse. We can no longer rely on the if (shifted_code == input) condition. Now, on every step, the receiver has to store (memory consumption, remember?) the difference between the input and the shifted code. Four times, for every sign combination.

That's not very efficient. That's why I propose the correlation-based algorithm. It takes 30 bits (milliseconds) from the PLL, generates a 20-bit-long output and searches for the maximum value. The position of the maximum is the shift, and its sign is the sign of the navigation message bit. It looks something like this:

 #include <cstddef>
 #include <cstdint>
 #include <cstdlib>

 namespace {
      //Circular matched filter: correlate src with the stretched NH-code at every shift
      void MatchedFilter1ms(const int16_t *src, size_t src_len,
           const int16_t *stretched_code, size_t stretched_code_len, int16_t *dst) {
           for (size_t i = 0; i < src_len; ++i) {
                dst[i] = 0;
                for (size_t j = 0; j < stretched_code_len; ++j)
                     dst[i] += stretched_code[j] * src[(i + j) % src_len];
           }
      }
 }

 int main() {
      //Placeholder sizes and buffers: in the real code they come from the PLL output
      //and the stretched NH-code
      const size_t SAMPLES = 30;
      const size_t NH_SAMPLES = 20;
      int16_t src[SAMPLES] = {};
      int16_t NH_code[NH_SAMPLES] = {};
      int16_t matched[SAMPLES] = {};

      MatchedFilter1ms(src, SAMPLES, NH_code, NH_SAMPLES, matched);
      int16_t max = 0;
      size_t imax = 0;
      for (size_t el = 0; el < SAMPLES; ++el) {
           if (std::abs(matched[el]) > max) {
                max = std::abs(matched[el]);
                imax = el;
           }
      }
      int16_t sign = matched[imax] > 0 ? 1 : -1;
      (void)sign;
 }

It uses more memory, which is not what you always want on an embedded system, but it's slightly faster with -O2 and much, MUCH more reliable, as you can see below.


That's it for now. There are still some opportunities to improve this algorithm, but I'm happy with it at the moment. Also, as far as I know, a truncated NH-code (only 10 bits long) is going to be used in the GLONASS L3 signal. With minor changes this code may be used for it as well.

An introduction

Hello.

My name is Michael and I'm a software engineer. I have a degree in GNSS and I'm trying to apply my knowledge to real receivers.
Sounds a bit too official, eh?

Well, my goal with this blog is to categorize and record some stuff that I find interesting. It will mostly be about software development, but who knows.

Hope you'll find this interesting and educational.