CrossKnowledge is breaking new ground in the way it delivers learning experiences and knowledge to its customers. This drive to innovate and to keep renewing ourselves is also reflected in the way we develop our solutions and in our technical choices.

Last year, we developed a brand new analytics feature for our customers. They can now see several relevant indicators in a user-friendly way, with very fast loading times.

The fast-loading constraint was the challenging part. How can we serve freshly computed data from several million entries in less than 3 seconds without overloading our databases? That’s how we came to Pandas. What is it? What problems does it solve? How do we use it at CrossKnowledge? You will find the answers in this article.

What is Pandas?

Pandas is the Python Data Analysis Library. It’s an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. In brief, it is made to aggregate and denormalize data in a very fast way:

[Figure: Pandas usage]

What problems does it solve?

First, let’s review the challenges in the context of our analytics functionality:

  • Aggregate and render millions of data points very quickly;
  • Avoid overloading our databases.

As we only need to update the data once a day, Pandas fits our needs well. It can import data from a CSV file, so we only have to export the data from our databases to a CSV file and let Pandas aggregate and denormalize it. Then we just need to retrieve the Pandas data in the LMS and render it in our dashboards.

By processing our data with an external tool, we avoid overloading the databases, and because the data is already aggregated, it is much faster to retrieve.

How do we use it?

In our case, we want to aggregate daily data such as the number of connections or the number of completed trainings for a given day.

So every day, in the middle of the night, we export the needed data from our database to a CSV file that we upload to an Amazon S3 bucket. Our Dataviz Pandas consumes the CSV file to aggregate and denormalize the data.

Once this is done, thanks to the REST API provided by the Dataviz, we can retrieve the Pandas data directly from our Learning Suite and display it. The following diagram describes how it works:

[Figure: Pandas simple architecture]

That is the simple version. In practice, we added a Varnish cache between our Learning Suite and the Dataviz Pandas to reduce the number of data retrievals.

As you can see in the diagram below, the only thing that changes is that we send a purge request to the Varnish cache each time we upload a new CSV file to the Amazon bucket:

[Figure: Pandas full architecture]

In short, by aggregating a large amount of data and making it fast to read, Pandas meets our expectations for an analytics feature that displays a lot of dashboards.

During the development of our new learning path technology, we faced some issues with responsiveness. One of them was making an iFrame that contains a video responsive.

The main challenge was to make the iFrame responsive while keeping its aspect ratio, so that no black bands appear around the video on resize.

The easy dirty way

Our first approach was to put the iFrame .crossPlayer-iframe inside a DIV .crossPlayer-wrapper, and to set the width and height of our iFrame to 100% (so it fits its container).

Now we have to give a size to our container. Remember: we want a responsive behavior. So let’s give .crossPlayer-wrapper a width of 100% and an auto height, so that it is sized by our iFrame content.

HTML:
<div class="crossPlayer-wrapper">
    <iframe class="crossPlayer-iframe">
        ...
    </iframe>
</div>
CSS:
.crossPlayer-wrapper {
  width: 100%;
  height: auto;
}

.crossPlayer-iframe {
  width: 100%;
  height: 100%;
}

Hoho… By doing this, the height of .crossPlayer-wrapper is not modified once the iFrame content is loaded…

Okay, let’s add a bit of JavaScript to compute the height as 9/16 of the width!

Wait… I will have to compute the height every time the window is resized if I want to keep the video ratio?!
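To make that cost concrete, here is a minimal sketch of what such a script could look like. The function name heightFor16by9 and the rounding to whole pixels are assumptions made for illustration; only the class name comes from the markup above.

```javascript
// Height that keeps a 16:9 ratio for a given width, rounded to whole pixels.
// (Hypothetical helper, not part of the original solution.)
function heightFor16by9(width) {
  return Math.round(width * 9 / 16);
}

// Browser-only part: the height must be set at load time
// and recomputed on every single resize event.
if (typeof window !== 'undefined') {
  var wrapper = document.querySelector('.crossPlayer-wrapper');
  if (wrapper) {
    var resizeWrapper = function () {
      wrapper.style.height = heightFor16by9(wrapper.offsetWidth) + 'px';
    };
    window.addEventListener('resize', resizeWrapper);
    resizeWrapper();
  }
}
```

The computation itself is trivial; the burden is the resize listener that has to stay in sync with the layout.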

See the problem? Let’s think of a better way!

The responsive way

Back to the start. We have our .crossPlayer-wrapper and our .crossPlayer-iframe.

To make .crossPlayer-iframe responsive and keep its ratio, we need .crossPlayer-wrapper to be correctly sized at load time and to keep its ratio on resize, without any JavaScript.

The solution is to put a placeholder in .crossPlayer-wrapper that has the same ratio.

Hey! There is a native HTML element that does the job: an image! Let’s try it!

HTML:
<div class="crossPlayer-wrapper">
    <img class="crossPlayer-placeholder" src="data:image/gif;base64,R0lGODlhEAAJAIAAAP///wAAACH5BAEAAAAALAAAAAAQAAkAAAIKhI+py+0Po5yUFQA7" />
    <iframe class="crossPlayer-iframe">
        ...
    </iframe>
</div>

Please note that the img src attribute contains the data of a transparent image with a 16/9 ratio.

CSS:
.crossPlayer-wrapper {
  position: relative;
}

.crossPlayer-placeholder {
  display: block;
  width: 100%;
  height: auto;
}

.crossPlayer-iframe {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
}

Here we are! The .crossPlayer-wrapper fits the image size (which keeps its ratio), and the iFrame, set to position: absolute, is placed over the image and takes all the space given by .crossPlayer-wrapper.

Moreover, we don’t use any particular JS or CSS trick, so this solution works even in old browsers such as IE8.
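As a quick sanity check on the placeholder, the data URI from the img tag can be decoded in Node.js to read the dimensions stored in the GIF header. This is only a verification sketch: the helper name gifSize is made up, and it relies on the GIF layout, where the width and height follow the 6-byte signature as two little-endian 16-bit integers.

```javascript
// Data URI copied from the img tag in the HTML above
var dataUri = 'data:image/gif;base64,R0lGODlhEAAJAIAAAP///wAAACH5BAEAAAAALAAAAAAQAAkAAAIKhI+py+0Po5yUFQA7';

// Hypothetical helper: read the logical screen size from a GIF data URI
function gifSize(uri) {
  var bytes = Buffer.from(uri.split(',')[1], 'base64');
  return {
    width: bytes.readUInt16LE(6),  // bytes 6-7: width
    height: bytes.readUInt16LE(8)  // bytes 8-9: height
  };
}

var size = gifSize(dataUri);
console.log(size.width + 'x' + size.height); // prints "16x9"
```

A 16x9 pixel image is enough: the browser scales it to the wrapper width and derives the height from the ratio.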

Nowadays, a front-end workflow requires developers to run many repetitive tasks, such as:

  • minification of assets
  • image compression
  • compilation or transpilation
  • preprocessing CSS (Sass, Less, Stylus)

None of these tasks needs to be run manually. Why not automate them? That is what a task runner is for. Gulp and Grunt are both task runners that handle this kind of work. However, Gulp prefers code over configuration: tasks are defined by writing code. This lets us create tasks that fit our very specific needs at CrossKnowledge.

Get started with Gulp

Gulp runs on Node.js, so please install Node on your machine if you have not already done so!

Then, install Gulp globally with this command: npm install -g gulp.

Installing a package globally lets you use it from your shell / command line, because its binaries end up in a directory listed in your PATH environment variable.

To use Gulp, you have to create a gulpfile.js in your project root. Gulp’s API is very simple and is composed of 4 functions:

  • gulp.task
  • gulp.src
  • gulp.dest
  • gulp.watch

Let’s get started by creating our first task, using gulp.task to define it:

// gulpfile.js
var gulp = require('gulp');

gulp.task('taskName', function() {
	// do something
});

Write a gulpfile.js

Create a basic task

One basic automation task is, for example, the minification of our JavaScript files. This task exercises the main parts of Gulp’s API, so you can build a simple, working example.

First of all, install the gulp-uglify plugin: npm install --save-dev gulp-uglify. This plugin will handle the minification of your JavaScript files.

The --save-dev option records the package under devDependencies in your package.json. When you run npm install from the directory where package.json is located, it installs all the dependencies required by your project.

Note that there is a shortcut for the --save-dev option: npm install -D gulp-uglify does exactly the same thing as npm install --save-dev gulp-uglify.

var gulp = require('gulp');
var uglify = require('gulp-uglify');

gulp.task('minify-js', function() {
    // Returning the stream lets Gulp know when the task has finished
    return gulp.src('./src/scripts/*.js')
        .pipe(uglify())
        .pipe(gulp.dest('./dist/js'));
});

Important: gulp-uglify emits an ‘error’ event when it is unable to minify a file. Until the release of Gulp 4.0, you need to handle it yourself or use the gulp-plumber plugin.
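The reason an unhandled ‘error’ event is fatal comes from Node.js itself: streams are event emitters, and emitting ‘error’ on an emitter that has no listener for it throws. The sketch below shows this with Node’s built-in events module only; no Gulp is involved, and emitFailure is a made-up stand-in for a plugin that fails on a file.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical stand-in for a plugin that cannot process a file
function emitFailure(emitter) {
  emitter.emit('error', new Error('Unable to minify file'));
}

// Case 1: nobody listens for 'error', so emit() throws.
// In a real gulpfile this exception is what stops the process.
var unhandled = new EventEmitter();
var crashed = false;
try {
  emitFailure(unhandled);
} catch (err) {
  crashed = true;
}

// Case 2: a listener is attached, so the error is simply reported.
var handled = new EventEmitter();
var messages = [];
handled.on('error', function (err) {
  messages.push(err.message);
});
emitFailure(handled);

console.log(crashed, messages);
```

Attaching your own ‘error’ listener to the plugin stream, or letting gulp-plumber take care of it, corresponds to the second case.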

To run this task, just run gulp minify-js.

Add more tasks

If you want to use a CSS preprocessor such as Sass, Gulp is a good match. Install the corresponding plugin and off we go: npm install --save-dev gulp-sass

// gulpfile.js
var gulp = require('gulp');
var sass = require('gulp-sass');

gulp.task('sass', function() {
  return gulp.src('./src/styles/main.scss')
    .pipe(sass())
    .pipe(gulp.dest('./dist/css'));
});

Here again, you need to use gulp-plumber if you get the ‘error’ event and don’t want the process to crash.

To run this task, just use gulp sass.

Now, let’s see one last task. We built some files with Gulp, but you probably don’t want to delete the generated files by hand. So let’s create a ‘clean’ task.

We will use del: npm install --save-dev del

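var gulp = require('gulp');
var del = require('del');

gulp.task('clean', function() {
    // Empty the two output folders used by the 'sass' and 'minify-js' tasks.
    // del returns a promise (del 2.0 and later), which tells Gulp when it is done.
    return del(['./dist/css/*',
        './dist/js/*'
    ]);
});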

Default tasks

OK, now we have a few tasks. We launch each of them with gulp my_task. But what if you have dozens of tasks? Or if you want task A to run once task B has completed?

Run several tasks in parallel

Now we create a ‘build’ task which runs the ‘sass’ and ‘minify-js’ tasks simultaneously.

gulp.task('build', ['sass', 'minify-js']);

When you run the ‘build’ task with gulp build, Gulp starts the ‘sass’ and ‘minify-js’ tasks.

Check out the run-sequence package if you want to run dependent Gulp tasks in a specific order.

Task dependencies

Often, you need to run a task once another task has completed. For this example, I introduce gulp-util (https://github.com/gulpjs/gulp-util). Install it with this command: npm install --save-dev gulp-util. gulp-util (imported as gutil below) is the official toolbelt of Gulp and provides some utilities.

var gulp = require('gulp');
var gutil = require('gulp-util');

gulp.task('isReady', ['build'], function() {
	gutil.log('The build is complete');
});

Note: gutil.log outputs your message to your console / terminal, formatted the Gulp way.
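To make the ordering guarantee concrete, here is a toy model of dependency resolution in plain JavaScript. It is not how Gulp is implemented (Gulp starts independent dependencies concurrently and waits for streams, promises, or callbacks); the task and run functions are invented for this sketch. It only illustrates that every dependency completes before the body of the dependent task runs, and that each task runs once.

```javascript
var tasks = {};
var log = [];

// Register a task with optional dependencies and an optional body
function task(name, deps, fn) {
  tasks[name] = { deps: deps || [], fn: fn || function () {} };
}

// Run a task after its dependencies; returns the completion order
function run(name, done) {
  done = done || {};
  if (done[name]) {
    return []; // already ran during this invocation
  }
  done[name] = true;
  var order = [];
  tasks[name].deps.forEach(function (dep) {
    order = order.concat(run(dep, done));
  });
  tasks[name].fn();
  order.push(name);
  return order;
}

task('sass');
task('minify-js');
task('build', ['sass', 'minify-js']);
task('isReady', ['build'], function () {
  log.push('The build is complete');
});

var order = run('isReady');
console.log(order.join(' -> ')); // sass -> minify-js -> build -> isReady
```

In this model, as in Gulp, the message is only logged after ‘build’ and everything it depends on has finished.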

The default task

There is one last thing you have to know, and maybe the most important one. What happens when you execute this command: gulp? Gulp looks for the ‘default’ task and tries to run it. So if you have not defined that task, the process stops.

Here is our final gulpfile.js:

// gulpfile.js
var gulp = require('gulp');
var sass = require('gulp-sass');
var uglify = require('gulp-uglify');
var del = require('del');

// 'clean' is declared as a dependency, so it finishes before the files are regenerated
gulp.task('minify-js', ['clean'], function() {
    return gulp.src('./src/scripts/*.js')
        .pipe(uglify())
        .pipe(gulp.dest('./dist/js'));
});

gulp.task('sass', ['clean'], function() {
  return gulp.src('./src/styles/main.scss')
    .pipe(sass())
    .pipe(gulp.dest('./dist/css'));
});

// Empty the output folders; the returned promise tells Gulp when it is done
gulp.task('clean', function() {
    return del(['./dist/css/*',
        './dist/js/*'
    ]);
});

// Run both build steps; 'clean' runs once, before either of them
gulp.task('build', ['sass', 'minify-js']);

gulp.task('default', ['build']);

Conclusion

Gulp is an excellent task runner that favors “code over configuration”, which lets it cover very specific needs.

For further reading, start with the official Gulp recipes.

The gulpfile.js we have built together in this introduction is very simple, but there is a lot more you can do with Gulp! There is a plugin for almost everything, and if you wish, you can also create your own. Do you want Gulp to manage the Browserify compilation? Check out this recipe. Are you using Less instead of Sass? Take a look at gulp-less.