Category: Software Engineering

June 17, 2020
Build your own Search Engine

Post by Dr. Rutu Mulkar-Mehta. In this post, I will take you through the steps for calculating the $ tf \times idf $ values for all the words in a given document. To implement this, we use a small dataset (or corpus, as NLPers like to call it) from the Project Gutenberg Catalog. This is […]

June 17, 2020
The Math behind Lucene

Lucene is an open source search engine that you can run on top of custom data to create your own search engine, like your own personal Google! In this post, we will go over the basic math behind Lucene and how it ranks documents against the input search query. THE BASICS - TF*IDF The […]

November 3, 2017
Docker Deep Learning - GPU-accelerated Keras

Machine Learning consulting companies should also be adept at software engineering, right? In this post, I'll show you how to prepare a Docker container able to run an already trained Neural Network (NN). It can be helpful if you want to redistribute your work to multiple machines or send it to a client, along with […]

November 3, 2017
Docker Swarm - how and when to use it?

In this post I'm going to analyze the role of Docker in each stage of the application lifecycle and try to highlight cases when you should consider moving to Swarm. It's not an introductory tutorial, but I'll create one in the future. Development with Docker Docker really made my life much easier. Let's just say, […]

August 15, 2017
Elasticsearch: 10 Advices to Get Started

We have recently finished an innovative, data-driven project based on Elasticsearch. The aim was to find similarities between objects across sets. The sets were static but sizable (90+ million records), and search had to be fast (nearly instant), so Elasticsearch was the best choice. During the project, I […]

April 11, 2017
SSLForFree - Setting up SSL with NGINX and LetsEncrypt

We all know that sweet green padlock in our browsers, meaning that our connection to a website is secure. Don't underestimate an encrypted connection! Without it, all data is sent in plain text. That is very dangerous if your site handles secret data, like passwords or emails. Recently, my colleague from work logged into our office router. […]

April 5, 2017
Git Hooks - Automatic Code Quality Checks

We all strive to write high-quality code. Every language allows us to run some quality checks or automatic unit tests. But even the best tests won't help if they aren't run often. Remember! If something takes too much time or effort, people will avoid it! Solution? We can reverse that! Let's make automatic tests effortless, […]

March 24, 2017
Tensorflow AWS setup - proper setup of version 1.0

After long development, Google released the first stable version of its Machine Learning library, TensorFlow. The release is an important milestone in the development of a common Machine Learning toolkit. TensorFlow provides a set of primitives from which Machine Learning engineers and researchers can construct trainable models — as well as a framework to run these computations […]

January 10, 2017
How to automatically fill PDF forms using Python and pdfrw

Recently at Sigmoidal we had a curious case of filling PDF forms for our users, who can then print the forms out, pre-filled by us, and use them. We had plenty of those forms to set up, so we needed an efficient way of doing it. Solution 0 - Putting Texts In Python The simplest solution goes like this: Take […]

