Posts

Timeout function if it takes too long to finish in Python

import signal
import time


# Note: this name shadows Python 3's built-in TimeoutError.
class TimeoutError(Exception):
    pass


class timeout:
    def __init__(self, seconds=1, error_message='Timeout'):
        self.seconds = seconds
        self.error_message = error_message

    def handle_timeout(self, signum, frame):
        raise TimeoutError(self.error_message)

    def __enter__(self):
        # SIGALRM is only available on Unix, and only in the main thread.
        signal.signal(signal.SIGALRM, self.handle_timeout)
        signal.alarm(self.seconds)

    def __exit__(self, type, value, traceback):
        # Cancel the pending alarm whether or not the block raised.
        signal.alarm(0)


# Raises TimeoutError after 3 seconds because the sleep takes 4.
with timeout(seconds=3):
    time.sleep(4)
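The same signal-based idea can also be wrapped in a plain function, which is handy when the caller wants to catch the timeout and carry on. This is a minimal sketch, not part of the original post: `run_with_timeout`, `OperationTimedOut`, `slow_call` and `fast_call` are hypothetical names, and like the class above it assumes a Unix system and the main thread.

```python
import signal
import time


class OperationTimedOut(Exception):
    """Hypothetical exception name, chosen to avoid shadowing the built-in TimeoutError."""


def run_with_timeout(func, seconds, *args, **kwargs):
    """Call func(*args, **kwargs) and raise OperationTimedOut after `seconds` whole seconds."""

    def _handler(signum, frame):
        raise OperationTimedOut(f"call exceeded {seconds} s")

    # Install our handler and remember the previous one so it can be restored.
    previous = signal.signal(signal.SIGALRM, _handler)
    signal.alarm(seconds)
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def slow_call():
    time.sleep(2)
    return "finished"


def fast_call():
    return "finished"


if __name__ == "__main__":
    print(run_with_timeout(fast_call, 1))
    try:
        run_with_timeout(slow_call, 1)
    except OperationTimedOut as exc:
        print("gave up:", exc)
```

Restoring the previous handler in `finally` keeps the helper from interfering with other code that uses `SIGALRM`.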

Correlated Cross-Occurrence (CCO): How to make data behave

Cross-occurrence lets us ask the question: are two events correlated? To use the ecom example, purchase is the conversion or primary action; a detail page view might be related, but we must test each cross-occurrence to make sure. I know for a fact that with many ecom datasets it is impossible to treat these events as the same thing and get anything but a drop in the quality of recommendations (I’ve tested this). People who use the ALS recommender in Spark’s MLlib sometimes tell you to weight the view less than the purchase, but this is nonsense (again, I’ve tested this). What is true is that *some* views lead to purchases and others do not, so treating them all with the same weight is pure garbage. What CCO does is find the views that seem to lead to purchase. It can also find category preferences that lead to certain purchases, as well as location preferences (triggered by a purchase when logged in from some location), and so on. Just about anything you know about users or c...
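The excerpt does not say which test is applied to each cross-occurrence. As an illustration only, here is a minimal sketch assuming a log-likelihood ratio (G²) test on a 2x2 table of user counts; the function names and the example numbers are hypothetical.

```python
import math


def _x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)


def _entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts.
    return _x_log_x(sum(counts)) - sum(_x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio score for a 2x2 contingency table.

    k11: users who viewed item B and purchased item A
    k12: users who purchased A but did not view B
    k21: users who viewed B but did not purchase A
    k22: users who did neither
    """
    row = _entropy(k11 + k12, k21 + k22)
    col = _entropy(k11 + k21, k12 + k22)
    mat = _entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def cross_occurrence_score(purchasers, viewers, total_users):
    """Score how strongly viewing B is associated with purchasing A."""
    both = len(purchasers & viewers)
    only_purchase = len(purchasers - viewers)
    only_view = len(viewers - purchasers)
    neither = total_users - both - only_purchase - only_view
    return llr(both, only_purchase, only_view, neither)


if __name__ == "__main__":
    purchasers = {"u1", "u2", "u3", "u4"}  # made-up user ids
    viewers = {"u1", "u2", "u3", "u9"}
    print(cross_occurrence_score(purchasers, viewers, total_users=100))
```

A score near zero means the view and the purchase look independent; larger scores suggest that the view is worth keeping as an indicator of the purchase. Any threshold for "large enough" would have to be chosen against real data.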

Backup WordPress Database And Filesystem Data On Linux With Scripts

If you’re like me, you run a WordPress blog and are terrified of the thought of something going wrong. With core updates, theme updates, plugin updates, and server component updates, there is a lot of room for error. This is where a WordPress backup could help ease your mind. WordPress recommends taking a backup of your blog before any of these updates, and there are even some popular plugins that will do this for you. For example, you could use the popular UpdraftPlus or similar, but I believe there is room for error in those as well. While I could be wrong, I think WordPress must be in good shape for backup plugins to be successful. The alternative is to create your own backup scripts that run on a cron schedule. We’re going to see how to do this for WordPress instances running on a Linux machine.

Creating the Backup Script

There are two core components that need to be backed up in case of a catastrophe. You need to back...

Classification and Clustering

In order to write a tutorial about classification, it was necessary to find an example that was broad enough that it would need to be subdivided. Since I actually care about whether you remember this stuff, it needed to be something that a lot of people like and would relate to. And since I have a lot of international subscribers, it needed to be cross-cultural as well. So what is universal, cross-cultural, and dearly loved? Beer. There’s American beer, Mexican beer, German beer, Belgian beer... hell, even the Japanese make beer. There’s IPA, Lager, Pilsner. Dark, light, stout. There are so many ways to classify beer that we could spend weeks doing it (so naturally, I did). Now, before you can classify anything, you have to determine the characteristics that you’re going to use. For beer you could use country of origin, color, alcohol content, type of hops, type of yeast, and calorie count, among other things. That way you could sort based on any of those characteristics t...

Redefining Decision Making

In today’s world of big data and the internet of things, it is common for a business to find itself sitting atop a mountain of data. Possessing it is one thing, but leveraging it for data-driven decision making is a much different ball game. Gut feelings and institutionalized heuristics have traditionally been used to guide the development of protocol and decision making, but the world of artificial intelligence and big, disparate data is changing that. Everyone is trying to make sense of, and extract value from, their data. Those that are not will be left behind. This challenge (and opportunity) is not limited to certain industries. For instance, most companies are exploring how they can use data to make better marketing decisions, most retailers are using data to optimize their supply chains, and most manufacturers are using data for quality control of final products. Almost all business problems (with surrounding data) can be broken down into two categories: supervised and unsu...

Enterprise Architecture and Cloud

Amazon Web Services

AWS is primarily Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). AWS offers a wide range of services that let you configure entire IT environments that are complex, powerful, secure, scalable and highly available, consisting of private networks, gateways, load balancers, servers, storage, databases, monitoring etc., all virtual and set up through configuration wizards or scripting. Moreover, AWS provides advanced services like containers, serverless computing, machine learning, message queuing etc., giving you a head start with those technologies without the upfront platform investments. The scope of AWS can range from hosting a single simple solution like a web server to a virtual, highly available hosting facility that fully replaces physical on-site hosting facilities. Hybrid solutions, where AWS acts as a cloud extension of on-premises IT, are also possible. Amazon Web Services is one of the leaders in IaaS/PaaS. AWS has set the standard...