Sort by Regex

Rax string reduction, /cat(), on the IMPALA backend was not trivial to implement. The IMPALA documentation states: GROUP_CONCAT … does not support the OVER clause, … Effectively this means no ORDER BY on GROUP_CONCAT in IMPALA. This must be implemented eventually, because GROUP_CONCAT is of limited use without it. In the meantime, regexes can be used …

Rax 1.1 released

We are happy to announce that Rax 1.1 is available from today! This release contains many stability and performance improvements. Most importantly: Added r’years and a’weeks magic tags to relative duration. Added a sample \ operator for sets: some_set\10 will list 10 random elements of some_set. In exploratory analytics scenarios this is handy as it’s …
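For readers who do not have Rax at hand, the behavior described for some_set\10 can be approximated in Python. This is only a sketch of the idea, not how Rax implements the operator: the function name sample, the seed parameter, and the choice to return all elements when the set holds fewer than k are assumptions made for illustration.

```python
import random


def sample(items, k, seed=None):
    """Return up to k distinct elements picked at random from items.

    Hypothetical stand-in for the Rax expression some_set\\k.
    Assumes the elements can be ordered, so a seed gives repeatable picks.
    """
    rng = random.Random(seed)
    pool = sorted(items)  # sets are unordered; fix an order before sampling
    return rng.sample(pool, min(k, len(pool)))


some_set = set(range(1000))
preview = sample(some_set, 10, seed=42)
print(preview)
```

Passing a seed makes the preview repeatable, which can help when two people compare notes on the same data; omit it to get a fresh selection on every call.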

Understanding Time

With behavioral data, time plays a very important role. Yet, time-related data is especially hard to analyze. The main culprit is the fact that time concepts are confusing. While humans can handle time intuitively, passing these intuitions to a computer program is hard. For example, implementing a growth rate using a relative duration of “twelve …

Kinda Like Assert

In 1983 my math teacher wrote on the blackboard: P{Q}R. Ever since, my code has been loaded with assertions. “Assertions should be used to document logically impossible situations and discover programming errors […] This is distinct from error handling: most error conditions are possible, although some may be extremely unlikely.” (Wikipedia) [Assertion] We have all seen, …

Pointer To Pointer

Some hold that pointers in a language like C are dangerous and hard to master. Both may be true, but so are scalpels. Since you are reading this, you know that pointer practice makes pointer perfect. The same goes for pointers to pointers. I remember vividly when I first saw some elegant code that used a …

Skip List

The skip list is a relatively unknown data structure. If you know it already, you can stop reading. If not, keep reading because “the algorithms for insertion and deletion in skip lists are much simpler and significantly faster than equivalent algorithms for balanced trees.” (William Pugh) [Skip lists: a probabilistic alternative to balanced trees]. Speaking …