Simply Slideshow Version 1.2.0 Released

This is the latest release of Simply Slideshow, with a few suggestions incorporated. This time around I compiled the app to target 64-bit CPUs to get around the “Out of Memory” errors some of you were having. It seems sensible to switch to 64-bit now, given the number of 64-bit CPUs currently on the market. Here are the updates:

  1. Multi-Window Support: You can now launch multiple windows using the same playlist, and they all stay in sync.
  2. New Framerate Control: Allows you to adjust the fps value. The default is 30 fps; look for the TimelineFrameRate setting field if you want to change it.
  3. Upgraded to 64bit.
  4. Upgraded to .NET Framework 4.5.1. You might need to download it from Microsoft.
  5. General Stability Issue Fixes.
  6. Switched from Windows installer to ClickOnce installer.

Uninstall the previous version before installing this one; otherwise you will get duplicates.

Download at:


An Introduction of Predictive Analytics in Data Mining with Big Data

This article is based on the Master’s essay titled “A Survey of Predictive Analytics in Data Mining with Big Data” (Lam, 2014).


Predictive analytics is an emerging term that grew out of the more established domains of Business Intelligence and Data Mining. It is a field derived from Data Mining and underpinned by the centuries-old disciplines of mathematics and statistics.

This article aims to provide a brief overview of the Predictive Analytics field and the common practices involved in this young but evolving field. The typical questions that arise are “What is a prediction?” and “How does prediction differ from estimation?”. To answer these questions, one must first understand the differences between explanatory modelling and predictive modelling. The goal of explanatory modelling is to provide a descriptive context for a situation, aimed at discovering and explaining the causal links between cause and effect constructs (Shmueli, 2010). Causality plays a central role in explanatory modelling, as shown in Figure 1. Predictive modelling, on the other hand, does not fixate on causality but focuses on the association relationships between measurable variables, as shown in Figure 2. In other words, predictive modelling is grounded in the notion of measuring the direct and indirect effects of any causes. This is important because measuring the effects of aggregate causes provides a higher degree of prospective indication of future events than explaining the causal effects of any pair of causal links.

[Figure 1] [Figure 2]

The reason behind the shift of focus from causal links to association relationships is that accurate prediction demands large sets of data, even up to the level of Big Data. Since we are interested only in the association relationships of measurable variables, Big Data provides the abundant source of data necessary for building a predictive model.

Simply put, the process of prediction is to infer unknown outcomes through the analysis of known outcomes. A definition of Predictive Analytics is shown below:

“To maximize the signal-to-noise ratio through the analysis of Big Data. To use the result of such analysis in combination of the advanced techniques of statistical modelling and the assistance of high performance computing devices, to derive meaningful information that provide a higher-than-guessing accuracy and precision. The derived information is capable of predicting trends and the validated result of each prediction will be used in updating the underlying statistical model continuously and perpetually.” (Lam, 2014)


From Classification To Prediction

Categorization is a very basic methodology humans use to make sense of the world. Humans use categorization to divide complex situations into meaningful and digestible forms; simplifying complications is the principal approach in general problem solving. Categorization sets the stage for classification, where concept groups begin to emerge and provide distinguishing classes for object classification. Classification is the foundation of Data Mining because it allows us to group and nest objects that exhibit similarities based on a posteriori knowledge, the prior knowledge that is formed as categorization matures into classification. That is to say, if a group of objects exhibits a certain degree of shared identifiable traits, and the group itself possesses enough characteristics to distinguish it from other similarly grouped objects, then the characteristics of the group’s constituents become the differentiable property of a class.

Once we amass enough classified information and present it as discrete classes, the building blocks for prediction are in place. The reason behind this classification-to-prediction transformation is the nature of historic recurrence. New events have a tendency to occur in the same fashion as previous events. Thus, a repetitive pattern can naturally be observed given enough time. This repetitive pattern serves as a high-level predictor for future events. This applies to everything from annual influenza virus strain prediction to financial engineering applications. To illustrate the relationship between classification and prediction, we use an insurance company as an example.

Insurance companies employ classification methodology to identify risk levels and to set a fair premium for their clients. A client’s attributes affect the premium to be paid in order to be insured. For health insurance, age, gender, income level, occupation, marital status, and pre-existing conditions and previous history of health problems are all examples of factors used in analyzing and classifying individuals into discrete classes with varying risk scores and pricing levels. We can say this is a form of risk-adjusted prediction by the insurance company in order to maintain positive revenue, because an individual who exhibits certain high-risk factors would belong to a class that demands a higher premium to offset the higher perceived risk. These high-risk factors are often determined empirically; that is to say, past experience has demonstrated a pattern where individuals with certain attributes are predisposed to a certain degree of risk. Therefore, a prediction is made on the likelihood of certain health risks (e.g. cancer, terminal illness and premature death) when the individual is assigned to a class. The above example illustrates the basic principle of how classification can aid in making predictions.

We have discussed how classification relates to prediction, and many techniques exist that operationalize classification methodology. The simplest and best known is the decision tree method. The basic form of a decision tree is derived from a binary tree structure with conditional nodes leading to branches, which allows a top-down walk through the nodes for the purpose of classification. This is done by starting at the topmost node and descending the tree hierarchy, where small, incremental decisions are made at each node pertaining to the attributes of the object in question. When we arrive at one of the lowest nodes, we arrive at the class that the object belongs to. Of course, the above method applies a very coarse-grained approach to prediction. The logic is as follows: if an object is classified as being enclosed by a set of predetermined attributes, the object is said to behave in the ways that other objects belonging to the same class would behave and have behaved. This is the basis of the classification-to-prediction approach, a simple approach that rests on deterministic thinking. However, a deeper dive into the details is important to move from simple prediction to advanced Predictive Analytics.
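The top-down walk described above can be sketched in a few lines of Python. This is only a minimal illustration: the tree, the attribute names (age, smoker) and the class labels are all hypothetical and not taken from any real insurer or from the essay.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # the class assigned when the walk ends at this node


@dataclass
class Node:
    question: Callable[[dict], bool]  # one small yes/no decision on the object's attributes
    yes: Union["Node", Leaf]          # branch taken when the answer is True
    no: Union["Node", Leaf]           # branch taken when the answer is False


def classify(tree, obj):
    """Start at the topmost node and descend until a leaf (a class) is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(obj) else node.no
    return node.label


# Hypothetical two-level tree; thresholds and labels are made up for illustration.
risk_tree = Node(
    question=lambda person: person["age"] >= 50,
    yes=Node(lambda person: person["smoker"], Leaf("high risk"), Leaf("medium risk")),
    no=Node(lambda person: person["smoker"], Leaf("medium risk"), Leaf("low risk")),
)

print(classify(risk_tree, {"age": 62, "smoker": True}))
print(classify(risk_tree, {"age": 30, "smoker": False}))
```

Every object that ends up at the same leaf is treated the same way, which is exactly the coarse-grained, deterministic behaviour discussed above.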

A Probabilistic Approach

There can never be absolute certainty in prediction, as the basis of prediction rests on statistical inference from what has been known and recorded. As such, a non-deterministic means of thinking about and measuring predictions provides an actionable and pragmatic approach to prediction: stochastic predictive methods, a probabilistic approach to answering questions that rests on the concept of chance.

Consider the statement that a person who uses tobacco products has a 40% probability of developing lung cancer. The 40% figure portrays a non-deterministic outcome of a given cause; it also conveys the message that four in ten people would develop lung cancer due to consuming tobacco products. Further, it tells us that the 40% figure is based on an aggregate measure of historical events from a sample dataset, a statistical reference derived from statistical analysis. However, what the figure does not convey is the specifics of why six in ten people will not develop lung cancer while the other four will for the same reason (i.e. tobacco). This is because we cannot determine with certainty the causalities of all possible factors involved. While tobacco use correlates with lung cancer incidence, there are many factors that can contribute to the same effect, such as hereditary factors, prolonged exposure to harmful chemicals, emotional stress and diet. Each individual has a certain degree of susceptibility to developing lung cancer for different reasons and different combinations of reasons, with and without environmental factors. There is no certainty in this measure and as such, we present the 40% reference figure as a statistical and probabilistic measure based on prior incidents.
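To make the “aggregate measure of historical events” concrete, here is a small Python sketch that derives such a figure from recorded outcomes. The ten records are invented purely for illustration, and the standard error is added only to show that the estimate itself carries uncertainty.

```python
from math import sqrt


def empirical_probability(outcomes):
    """Fraction of recorded cases in which the outcome occurred (1 = yes, 0 = no)."""
    if not outcomes:
        raise ValueError("need at least one recorded outcome")
    return sum(outcomes) / len(outcomes)


def standard_error(p, n):
    """Standard error of a proportion p estimated from n records."""
    return sqrt(p * (1 - p) / n)


# Hypothetical sample: 1 = developed the condition, 0 = did not.
records = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

p = empirical_probability(records)
print(f"estimated probability: {p:.0%}")
print(f"standard error: {standard_error(p, len(records)):.3f}")
```

The estimate says nothing about which individuals fall on which side; it only summarizes how often the outcome was observed in the sample.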

Augmenting Predictions with Big Data

Factors for predictions are called predictors, and having more predictors available generally produces more accurate predictions. Even a small number of predictors results in an overwhelming number of permutations, arrangements and combinations of factors. Big Data amplifies this problem as the number of predictors grows exponentially and the association relationships that exist among them begin to multiply.
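A quick way to get a feel for this growth is to count, for n predictors, the number of pairwise associations and the number of non-empty subsets of predictors. The sketch below is a counting exercise only; the function name and the sample sizes are my own choices for illustration.

```python
from math import comb


def predictor_counts(n):
    """Count pairwise associations and non-empty predictor subsets for n predictors."""
    return {"pairs": comb(n, 2), "subsets": 2 ** n - 1}


for n in (5, 10, 20, 40):
    counts = predictor_counts(n)
    print(n, counts["pairs"], counts["subsets"])
```

The number of pairs grows quadratically, while the number of possible predictor subsets doubles with every predictor added.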

Big Data is known for its three Vs: volume, variety and velocity (Gartner, 2014). Each of these properties increases the number of data dimensions available for us to correlate factors and analyze relationships. This is both beneficial and detrimental to our ability to sift through the massive amount of data. More data comes with more noise in addition to the signal that would provide us with deeper context. The added data dimensions provided by Big Data supply the contextual information necessary for deep machine learning. For instance, suppose we are to diagnose a patient’s predisposition to Alzheimer’s disease. The genetic and health information are the core factors, while the behavioural information is supplementary to the diagnosis. Behavioural information such as dietary habits, the level of physical and cognitive exercise, the size of the patient’s social group, and even the patient’s personal interests could be correlated in many different ways. In addition, the biometric data produced by health monitoring devices continues to supplement the overall personal Big Data in real time.

More data means more context, and thus Big Data provides us with the much-needed context in any predictive modeling endeavor. More is not just more, more is different (Anderson, 1972). Building a context-aware predictive model requires consuming highly multi-dimensional data: contextual information that is deduced from data outside the set of core factors. Thus, Big Data enables us to extract the vital data embedded within the many sources of data, combined with critical real-time data, to discover the underlying correlations and relationships.

Issues and Trends

The issue with Predictive Analytics and the complementary role of Big Data goes back to the definition at the beginning of this article, where the goal of Predictive Analytics is to maximize the signal-to-noise ratio within any given set of data. That objective reveals the ongoing challenge of maximizing our ability to separate the signal from the noise, on which we have made great advances through improvements in data modeling and algorithms. To that end, the study by Niculescu-Mizil and Caruana (2005) suggested a performance plateau among various commonly used techniques, based on extensive benchmarking and model calibration. The result indicated that no individual machine learning method provides a significant gain over the others; that is to say, a calibrated SVM has approximately the same predictive performance as an artificial neural network or a decision tree when they are given the same dataset.

Findings such as these motivate the ensemble approach. The basic premise of the ensemble approach is to embrace diversity in aggregate so that collective intelligence can be achieved. Ensemble methods are gaining momentum because the underpinning meta-learning idea of combining weak learners to make a strong learner means existing model performance can be improved by aggregation. The trend of ensemble modeling is exemplified by the winning BellKor’s Pragmatic Chaos ensemble team in the Netflix Prize event (Koren, 2009).
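The idea of combining weak learners can be illustrated with plain majority voting, which is one simple form of aggregation (it is not the method used by the Netflix Prize winners). The three rules below are hypothetical and deliberately constructed so that each one makes a different mistake.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the label predicted by most of the models for input x."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


def accuracy(predict, data, truth):
    """Share of inputs in data for which predict agrees with truth."""
    return sum(predict(x) == truth(x) for x in data) / len(data)


def truth(x):
    # The hypothetical "true" concept the learners try to capture.
    return x >= 5


# Three hypothetical weak learners, each wrong on a different input.
weak_learners = [
    lambda x: x >= 4,            # wrong at x = 4
    lambda x: x >= 6,            # wrong at x = 5
    lambda x: x >= 5 or x == 2,  # wrong at x = 2
]

data = list(range(10))
for learner in weak_learners:
    print("single learner accuracy:", accuracy(learner, data, truth))
print("ensemble accuracy:", accuracy(lambda x: majority_vote(weak_learners, x), data, truth))
```

Because the individual errors do not overlap, the vote corrects each of them. Real learners make correlated errors, so the gain in practice is smaller than in this toy case.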

Concept drift is another ongoing challenge that many researchers have faced (Venkatesan, Krishnan, & Panchanathan, 2010). Concept drift refers to changes in the predictive nature of the independent variables used in the underlying data and model. It describes data whose variable relationships have shifted away from those in the training data used during the model’s supervised learning process. In other words, the data used to train a model no longer represents the current data being processed by the model. Concept drift can be detected and handled by different techniques, and boosting methods such as AdaBoost are among the most commonly employed.
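One very simple way to notice drift is to watch the model’s recent error rate and compare it with the error rate seen at training time. The sketch below does only that; it is not AdaBoost and not the cost-sensitive boosting method from the cited paper, and the class name, window size and tolerance are assumptions chosen for illustration.

```python
from collections import deque


class DriftMonitor:
    """Flags possible drift when the recent error rate exceeds the baseline by a tolerance."""

    def __init__(self, baseline_error, window=50, tolerance=0.15):
        self.baseline_error = baseline_error  # error rate measured on the training data
        self.tolerance = tolerance            # how much worse we accept before flagging
        self.recent = deque(maxlen=window)    # sliding window of recent mistakes

    def update(self, predicted, actual):
        """Record one prediction outcome; return True if drift is suspected."""
        self.recent.append(predicted != actual)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough evidence yet
        recent_error = sum(self.recent) / len(self.recent)
        return recent_error > self.baseline_error + self.tolerance


monitor = DriftMonitor(baseline_error=0.1, window=10, tolerance=0.2)

# Ten correct predictions, then a run of mistakes as the data changes.
stable = [monitor.update(predicted=1, actual=1) for _ in range(10)]
drifted = [monitor.update(predicted=1, actual=0) for _ in range(4)]
print(stable)
print(drifted)
```

Once the flag is raised, the usual response is to retrain or reweight the model on recent data, which is where the boosting-based techniques come in.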


Anderson, P. W. (1972). More Is Different. Retrieved from

Gartner. (2014, 06 30). Gartner IT Glossary – Big Data. Retrieved from

Koren, Y. (2009). The BellKor Solution to the Netflix Grand Prize. Retrieved from

Lam, D. W. (2014). A Survey of Predictive Analytics in Data Mining with Big Data. Edmonton, Alberta, Canada: Athabasca University. Retrieved from

Niculescu-Mizil, A., & Caruana, R. (2005). Predicting Good Probabilities With Supervised Learning. Proceedings of the 22nd International Conference on Machine Learning. Cornell University. Retrieved from

Shmueli, G. (2010). To Explain or To Predict? Statistical Science. Retrieved from

Venkatesan, A., Krishnan, N. C., & Panchanathan, S. (2010). Cost-sensitive Boosting for Concept Drift.

A Survey of Predictive Analytics in Data Mining with Big Data

Ok, this is it. I am done with my program and concluded it with an essay titled “A Survey of Predictive Analytics in Data Mining with Big Data”.

If you are interested, have a look here.

What this means is that I can now get back to updating the apps with features and bug fixes that are long overdue…


Version 1.1.63 – Simply Slideshow

This is a light release with only one suggested feature: Open Slideshow Image(s). You can now click on the folder button (alongside the slideshow control buttons) to see the currently selected image(s) in Windows Explorer. Note: If you installed 1.1.62, you need to reinstall with 1.1.63. Note: the default folder path of “C:\Users\Public\Pictures\” sometimes causes the program to crash; remove it from the list if you experience crashes.

You can get it @softpedia or @majorgeeks.


Wow – Grand Prize Winners Chosen!

A while back I signed up for the Microsoft Developer Movement Canada competition to create awesome Windows Phone 7 apps and publish them to the MS marketplace. There are 3 levels of achievement: 1 app (reward = some tech toys), 2 apps (reward = more tech toys or a gift card) and 3 apps (GRAND PRIZE!). MS informed me some time ago that 2 of my apps were approved, and I received some of the rewards. Last week I got an email saying that I am one of the 5 winners of the grand prize in the national competition! Great news! Check out the announcement here: and here





Simply Moods @ Marketplace

This is my third Windows Phone 7 app published on the MS marketplace. It is a simple tool to track your mood of the moment and view it at a later date. If you have used Simplify Receipts you will have no problem getting started with this app; the usage flow is very similar.

You can find it at:


Version 1.1.60 – Simply Slideshow

Finally got a chance to update it to the current version, 1.1.60. You can download the software at CNET or @here. This version addresses many reported bugs in the previous version and adds a “Repeat Slideshow When Ended” option to repeat the slideshow for non-random play. Note that if you download the program from CNET, they have a custom installer that prompts you to install a third-party toolbar, etc. If you are not interested in those suggested installations, make sure you select the “I do not agree” option before clicking the Next button.

Dickson Review

Not too many software hosting sites would go as far as They informed me of a review they did of Simply Slideshow, coupled with a video tutorial and a virus scan report. They have gone to great lengths to review Simply Slideshow. They also granted the software a five-star award.


Simply Slideshow antivirus scan report at

“’s team has also made a VIDEO TUTORIAL showing the installation, the main interface, the main features of Simply Slideshow, letting our visitors as well as the potential users of your product take a glance at the main qualities of this software before downloading it. This video tutorial is also intended to help beginners understand how to use your product:
Simply Slideshow Installation Video Tutorial


Reviewed @

Here comes another great review of Simply Slideshow, from a German site. Luckily for us, we have Google Translate to bring it into English. Check it out in English or the original post in German. Many thanks to the author for such a detailed review.

There is another one at in English and German. And this one at

Simplify Receipts on Marketplace

I am using this app to record receipts from all my transactions and digitize them neatly on my Windows Phone. I developed the app without a db engine, but it is still sufficient to store and retrieve a reasonable amount of data. I plan on improving it in the future if I see more interest from users. You can find the app at the Windows marketplace.

Simply TileMatch!

Yet another concentration game 🙂 I decided to explore the Windows Phone 7 platform and so Simply TileMatch! was born! If you are amongst the 2% (as of today) who happen to have a Windows Phone then you can go download it at the Windows marketplace. And check out the cool review by George Ponder at wpcentral; many thanks to George for taking the time to do the review!

Here is a quick look at the game:

Simply Slideshow 1.1.22

Simply Slideshow version 1.1.22 is now available to download at CNET. The updates include:

  • Added 2 more slideshow styles: Simple Filled and Simple Fitted
  • Updated UI
  • Bug fixes
  • Optimized for large numbers of images

Most users with experience of the older versions prefer the simple slideshow style found in other slideshow software. These updates should satisfy both groups of people with opposite preferences. The simple slideshow works better in “Stay At Bottom” mode because it creates less distraction.



Simply Slideshow Version 1.1.10

The latest version of Simply Slideshow is now available to download at CNET. This version contains numerous updates and enhancements. I was also able to include some of the suggestions from the emails I received. The updates include:

  • Added 5 more transition effects
  • Updated UI
  • Bug fixes
  • Optimized for large numbers of images

Simply Slideshow Version 1.0.35

I have received over a dozen emails requesting new features of Simply Slideshow. Most of you wanted to adjust the speed of the animation so I incorporated that into this latest version. You can download the latest version of Simply Slideshow from CNET or from Softpedia.

What’s changed?
• Added the ability to adjust the animation speed.
• Added Pause button.
• Added Play button.
• Some minor bug fixes.

A Great Review by Manav Mishra!

Today on, you will find a great article by Manav Mishra written just hours ago. His take on “Simply Slideshow” is the most comprehensive to date, and Manav really captured the essence of the tool. As Manav suggested, the built-in Windows slideshow just doesn’t do it right, and that prompted me to build one. He talked about the interface in good detail and the benefits of it being simple and lightweight (< 1 MB), which is exactly how I wanted to design it - simple. I will certainly look into Manav’s suggestions in the upcoming release. You can find the article on at

Thanks Manav for your awesome review!


Intro on

I believe the post is in Swedish, so I turned to Google to translate it.

Simply Slideshow is a free and easy program to use for playback. You can include choosing whether to show images in full screen or not, randomise the images and also choose whether the random images to appear more than once.
The installation file is approximately 1MB.

Check it out at: