Mark van Venrooij

Mark van Venrooij's blog

Author: Mark van Venrooij

Learning domain-driven design

At my new job the department applies Domain-Driven Design (DDD). I read Domain-Driven Design – Tackling Complexity in the Heart of Software by Eric Evans a couple of years back. However, as I never used it, I still need to learn how to apply it. So in order to really learn DDD I started rereading Eric Evans’ book, and I plan on reading Implementing Domain-Driven Design by Vaughn Vernon as well. As theory and practice are only equal in theory, I want to apply DDD to my favorite pet project. To recap an earlier blog post: many years ago I started categorizing all my financial transactions. At first I built a tool myself to give me insight into my spending; later I bought something to do this for me. By now this is a domain I know pretty well, and I use it to learn new things.

Current progress

So far I have read up to chapter 5 of Eric Evans’ book. The key things I learned in the first chapters:

  • Make domain concepts explicit by capturing the knowledge in the design
  • Define a ubiquitous language using domain terms
  • The feedback loop from coding back to the design is often missing
  • The domain must be isolated from the user interface, db glue and other supporting code in order to achieve separation of concerns
  • The Smart UI approach (business logic placed directly in the user interface) cannot be combined with DDD

Moving forward

My plan is to finish reading both books before the end of the year. (I hope I can find the time, given that I have a young daughter.) At the same time I plan to work on my pet project. As I learned long ago, a good way to learn things is to teach others. So I want to share my progress in applying DDD to my pet project through this blog. Expect some new blog posts on this topic in the coming weeks: basically me struggling to apply the DDD concepts.

Quick update

A long overdue update from my side. As mentioned in a previous post, I got married last year. On top of that, we welcomed a newborn daughter this year. So my life has changed a lot over the last 12 months.

Update on my path to financial independence

We saved a bit less than last year's goal of 25%. This is basically due to buying a nursery and all the other things for my daughter. However, having a young daughter easily compensates for that. The savings as a percentage of my FI goal basically hit the target, as the stock market did pretty well. Ignoring any market changes and interest, it would take approximately 36 years to become financially independent given the total savings of last year. The goals for this year are a bit more modest compared to last year. It should be possible to reach a 20% savings rate, and that would result in reaching 18% of my FI goal.

Savings rate 2016

Savings rate year 18.5%
Savings as percentage of FI goal 14%
Years till financial independence using total amount of savings of last 12 months 36

Financial goals 2017

Savings rate 20%
Savings as percentage of FI goal 18%

Financial independence

Happy new year to everyone! I wish you all a healthy and successful 2016. In my previous post I mentioned that I would explain more about my path to becoming financially independent.

What is financial independence

According to Wikipedia, financial independence means:

Financial independence is generally used to describe the state of having sufficient personal wealth to live, without having to work actively for basic necessities. For financially independent people, their assets generate income that is greater than their expenses

How does it work

There are many blogs that explain how to become financially independent. Let me try to give a quick summary.

As stated, the main goal is to let my assets generate enough income to cover my expenses. The question then is how much in assets I need. This question is not easy to answer, and the answer depends on your personal situation. There are many ways to generate income with money, e.g. becoming a landlord/landlady. In my example, however, I’m going to use the stock market to generate income, as this is the most common scenario.

As we all know, stock markets go up and down and nobody can predict them correctly. Given this much uncertainty, how much in assets do we need to make sure we won’t have to work again? Or in other words: how much can I withdraw from my assets without depleting them too quickly? The Trinity study tries to answer this question. Basically, they used historical data to check whether withdrawing X% of the initial portfolio each year was possible for at least 30 years without running out of money. The study puts a safe withdrawal rate at 3 to 4% of the initial total portfolio value. As we all know, results in the past are no guarantee for the future, but we have to work with something.

To be conservative let’s take 3%. That means I need about 33 times my yearly expenses (1 / 0.03) in assets to be able to call myself financially independent. The next step is how to get there. I don’t think you have this amount of money lying around; at least I don’t. To become financially independent I need to save this amount of money. As said before, I’m using the stock market to earn money with money. How much do I need to invest each month?

There is a metric that is often used to answer this question: the savings rate. This is easy to calculate: savings divided by income (after taxes), times 100%. E.g. assume I have an income of €2500 and save €1000 each month. My savings rate would be: 1000 / 2500 * 100% = 40%. Using the savings rate, anyone can determine how many years it would take to become financially independent.
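The arithmetic so far can be sketched in a few lines of Python. The function names are made up for this illustration; only the numbers come from the text above.

```python
def savings_rate(monthly_savings: float, monthly_net_income: float) -> float:
    """Savings rate as a percentage of after-tax income."""
    return monthly_savings * 100 / monthly_net_income


def target_multiple(safe_withdrawal_rate: float) -> float:
    """How many times the yearly expenses are needed in assets."""
    return 1 / safe_withdrawal_rate


print(savings_rate(1000, 2500))          # 40.0
print(round(target_multiple(0.03), 1))   # 33.3
```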

Savings rate   Years left     Savings rate   Years left
5%             102            55%            21
10%            78             60%            18
15%            65             65%            15
20%            55             70%            13
25%            47             75%            10
30%            41             80%            8
35%            36             85%            6
40%            31             90%            4
45%            28             95%            2
50%            24             100%           immediate

Assumptions used in the calculations above: there are no current savings, the ROI is a conservative 3% per year, expenses stay as high as they are now, income will not grow or decline, and finally there is no inflation. Sure, there will be inflation and you will probably get a raise. But let’s ignore both for simplicity; they will probably offset each other anyway. Historically the ROI of the stock market is higher than the 3% I used here, which should compensate for inflation as well.
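The table can be approximated with a small simulation. This is a sketch under the stated assumptions plus two that the text does not spell out: the target is yearly expenses divided by the 3% withdrawal rate, and returns are added once a year before that year's savings are deposited. With those choices the loop appears to give the same year counts as the table.

```python
def years_to_fi(savings_rate: float, roi: float = 0.03,
                withdrawal_rate: float = 0.03) -> int:
    """Years until savings reach yearly expenses / withdrawal_rate.

    Income is normalised to 1 per year, so yearly savings equal
    savings_rate and yearly expenses equal 1 - savings_rate.
    """
    if savings_rate <= 0:
        raise ValueError("without savings the target is never reached")
    target = (1 - savings_rate) / withdrawal_rate
    balance = 0.0  # assumption from the text: no current savings
    years = 0
    while balance < target:
        # Assumption of this sketch: yearly compounding, then the deposit.
        balance = balance * (1 + roi) + savings_rate
        years += 1
    return years


for pct in range(5, 100, 5):
    print(f"{pct}%: {years_to_fi(pct / 100)} years")
```

Changing `roi` or `withdrawal_rate` shows how sensitive the number of years is to those two assumptions.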

My road to financial independence

For my path to financial independence I calculated how much money I would need after paying off my mortgage, based on current expenses. Using a safe withdrawal rate of 3%, which means multiplying these yearly expenses by 33, I got my target amount. I don’t count any future ROI on my current savings, because I assume it is needed to compensate for inflation.

Given these assumptions the numbers for December are:

Savings rate last 12 months 30.6%
Savings rate year to date 30.6%
Savings as percentage of financial independence goal 12.1%
Years till financial independence using total amount of savings of last 12 months 32

As you can see, it takes me longer than the expected 25 years I was speaking about in my previous post. That is mainly because I don’t take any ROI into account in these calculations.
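The "years till financial independence" figure without ROI is a simple division. A minimal sketch, where the euro amounts are hypothetical and only chosen so that the current savings are 12.1% of the target; rounding up to whole years is also an assumption of this sketch:

```python
import math


def years_without_roi(target: float, current_savings: float,
                      saved_last_12_months: float) -> int:
    """Years needed when every future year adds the same amount and
    the savings earn no return at all."""
    remaining = max(0.0, target - current_savings)
    return math.ceil(remaining / saved_last_12_months)


# Hypothetical amounts, not the real numbers behind this post.
print(years_without_roi(target=600_000, current_savings=72_600,
                        saved_last_12_months=16_500))  # 32
```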

Financial goals 2016

Savings rate 25%
Savings as percentage of financial independence goal 13.8%

For 2016 my goal is to save 25% of my income. This is much lower than the savings rate of last year. The main reason for this drop is that I’m getting married this year. Adding these new savings to my current stash (ignoring any market changes), I would be at 13.8% of my end goal at the end of 2016.

Review 2015, plans 2016

Has another year really flown by? My calendar definitely says yes. So it is time to look back at 2015 and make plans for 2016.

2015

Good things I want to continue

As you can probably see from my previous posts, I am somewhat obsessed with my personal finance situation. At least that is what the people around me tell me. I just like numbers and statistics; in other words, I’m a nerd and I’m perfectly happy with that. At the beginning of 2015 I made plans to be financially independent in approximately 25 years. I have been able to stick to this plan since I made it. I won’t go into the details here (it sounds like it is worth a post of its own). Some people around me tell me this is impossible. Maybe that is true, but planning for it at least makes me less dependent on my job in the future.

At the beginning of the year I had a job quite far from home. I was lucky to find a job with more responsibilities closer to home. You can find details about this on my LinkedIn profile. I love this job and hope I’m able to keep it in the coming year.

Things I want to improve

One thing I totally failed at in 2015 is blogging. If I look at my post list for 2015, it tells me there are exactly 0 posts so far. So the plan for 2016 is to write a monthly post. I’m not sure what the content will be, but in this post I found at least one subject to explore.

In 2015 I was very busy with my job and lots of other stuff that was really important. I totally forgot to sharpen the knife. So in 2016 I need to spend more time and effort on learning new things. This includes both professional and personal skills.

Yes I was busy and I totally forgot about my photography hobby. I made some nice pictures, but not nearly enough if you look at the fun it brings me.

Plans 2016

So there are some things to improve on, as I mentioned in my review of 2015: blog at least once a month, learn new things such as machine learning, and spend at least 1 day a month making very nice photos.

The biggest plan, however, is to marry my girlfriend. This will not only affect 2016 but probably the rest of my life. After being together for more than 10 years I feel this is the best way forward. It will affect my plans for being financially independent in a negative way, and the planning will cost a lot of time that could harm my other plans, but I think it is totally worth it.

Naive Bayesian Classifier on transactions

For a long time I have had the idea to enhance my BudgetApp application with machine learning techniques. I would like to use these techniques to categorize my transactions. Let me first explain how my current solution works. I have a set of rules that are used to assign a category to a transaction. In principle there are two kinds of rules: rules based on the description of a transaction and rules based on the contra account number of the transaction. The first rule that matches the transaction is used to categorize it. The rules engine works quite well. About 90% of all transactions are recurring, so these can be categorized automatically. But the moment you get a new kind of recurring transaction, you have to create a new rule so that it will be categorized automatically in the future. My total ruleset now contains about 100 rules. Over a few years I collected about 1400 transactions that were categorized with the above mechanism. After first trying an implementation of the K-nearest-neighbor algorithm, I ended up with a Bayesian classifier implementation, which works quite well.
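A first-match rules engine like the one described can be sketched as follows. This is not the actual BudgetApp code; the class names, fields and example rules are hypothetical, and matching the description with a case-insensitive substring test is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transaction:
    description: str
    contra_account: str


@dataclass
class Rule:
    category: str
    description_contains: Optional[str] = None  # description-based rule
    contra_account: Optional[str] = None        # contra-account-based rule

    def matches(self, tx: Transaction) -> bool:
        if self.description_contains is not None:
            return self.description_contains.lower() in tx.description.lower()
        if self.contra_account is not None:
            return self.contra_account == tx.contra_account
        return False


def categorize(tx: Transaction, rules: List[Rule]) -> Optional[str]:
    """Return the category of the first matching rule, or None."""
    for rule in rules:
        if rule.matches(tx):
            return rule.category
    return None


rules = [
    Rule(category="groceries", description_contains="supermarket"),
    Rule(category="rent", contra_account="123"),
]
print(categorize(Transaction("SUPERMARKET AMSTERDAM", "999"), rules))  # groceries
print(categorize(Transaction("unknown shop", "555"), rules))           # None
```

Because the first match wins, the order of the rules matters: more specific rules have to come before more general ones.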

So 2 weeks ago I finally decided to try using machine learning techniques. With all the categorized transactions I have a great set of data that I can use for supervised learning, and that I can also use to validate my solution. There are a few basic requirements for my solution. I don’t mind if the chosen solution is unable to classify a transaction, but it should produce very few incorrect classifications. The tool is used to create a budget, and if too many transactions are incorrectly classified, the budget might be incorrect.
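Since "could not classify" is acceptable but "classified incorrectly" is not, validation has to count three outcomes instead of two. A minimal sketch of such a tally; the `evaluate` helper and the toy data are hypothetical, and a classifier is assumed to return None when it cannot decide.

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Tuple


def evaluate(classify: Callable[[str], Optional[str]],
             labelled: Iterable[Tuple[str, str]]) -> Counter:
    """Tally correct, unclassified and incorrect predictions."""
    tally: Counter = Counter()
    for features, expected in labelled:
        predicted = classify(features)
        if predicted is None:
            tally["unclassified"] += 1
        elif predicted == expected:
            tally["correct"] += 1
        else:
            tally["incorrect"] += 1
    return tally


# Toy classifier: a lookup from contra account to category.
known = {"123": "salary", "456": "rent"}
validation = [("123", "salary"), ("456", "groceries"), ("789", "gifts")]
result = evaluate(known.get, validation)
print(result["correct"], result["unclassified"], result["incorrect"])  # 1 1 1
```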

One of the easiest classification algorithms to implement is K nearest neighbour. The algorithm might be slow if you have many data points, but my total data set is small (only 1400 entries). The main part of this algorithm is constructing a distance function that determines how similar the new transaction is to each of the other transactions. Looking at the previous solution, I reckon that there are two major features of a transaction that are used to categorize it: description and contra account. So the distance function should probably use these two things. I think the contra account is a good place to start. Basically, each different contra account should indicate another category. While running this on the real data I noticed that I got a lot of incorrect categorizations, and more importantly that it took a few minutes to calculate the category of 10 transactions based on the ‘learned’ input of about 1000 transactions. So the solution is actually pretty slow, even on my small data set. I don’t find it acceptable that categorizing a dozen transactions takes minutes when the old way only takes seconds. I checked whether I could improve my calculations, but I found no major mistakes. Another problem is that I can’t think of a way to construct a distance function for descriptions. So now I have an extra requirement: the solution should be about as fast as the rules solution.
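For illustration, K nearest neighbour with a contra-account distance could look like the sketch below. The post does not show the real distance function, so the 0/1 distance used here is an assumption, as are the names and the toy data.

```python
from collections import Counter
from typing import List, Tuple

Labelled = Tuple[str, str]  # (contra_account, category)


def distance(account_a: str, account_b: str) -> int:
    """Assumed distance: 0 for the same contra account, 1 otherwise."""
    return 0 if account_a == account_b else 1


def knn_classify(account: str, training: List[Labelled], k: int = 3) -> str:
    """Majority vote among the k training transactions closest to account."""
    # Every query is compared against the complete training set.
    nearest = sorted(training, key=lambda item: distance(account, item[0]))[:k]
    votes = Counter(category for _, category in nearest)
    return votes.most_common(1)[0][0]


training = [("123", "salary"), ("123", "salary"),
            ("456", "rent"), ("789", "groceries")]
print(knn_classify("123", training, k=3))  # salary
print(knn_classify("456", training, k=1))  # rent
print(knn_classify("456", training, k=3))  # salary
```

The last call shows one way such a classifier can go wrong: with only one matching neighbour and k = 3, the two unrelated neighbours outvote it. Whether this is what caused the incorrect categorizations mentioned above is not known.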

Back to the drawing board it is. I need to select a different classification algorithm. After spending some time on the Wikipedia pages, I think a naive Bayesian classifier might help me. It is pretty straightforward to implement. Furthermore, it seems to be quite effective and efficient for similar problems (like spam filtering). A naive Bayesian classifier assumes that the presence or absence of a feature is independent of the presence or absence of any other feature. This assumption is often wrong, but in practice it seems to work pretty well. Let’s implement this thing.

Let's start with the contra account number. What is the probability that a new transaction with contra account 123 belongs to category x? I think it should be the number of transactions with contra account 123 and category x in the training set, divided by the number of all transactions with contra account 123. I can’t think of a better way to find the probabilities for this feature. After implementing this, I use about 10% of all known transactions as the training set and try to categorize the remaining transactions to validate my solution. The results are really promising: 533 transactions are categorized correctly, 697 can’t be categorized based on the contra account number (basically the contra account number for these transactions is zero or not in my training set), and only 11 transactions are classified incorrectly. After some investigation of these incorrect transactions I notice that 10 of them are put in the category “salary” while it should be “expense claim”. As my probability only takes the account number as a feature, this makes sense: I have far more salary payments than expense claim payments. This problem should be resolved if I add more features to my probability calculations. If I add the description, the distinction can probably be made. I might include the transaction amount as well, as these amounts are quite far apart.
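The counting described above can be sketched like this. It is an illustration of the idea, not the actual implementation; the function names and the toy training data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def train(transactions: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count categories per contra account: counts[account][category]."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for account, category in transactions:
        counts[account][category] += 1
    return counts


def probability(counts: Dict[str, Counter], account: str, category: str) -> float:
    """Transactions with this account and category / all with this account."""
    per_account = counts.get(account)
    if not per_account:
        return 0.0
    return per_account[category] / sum(per_account.values())


def classify(counts: Dict[str, Counter], account: str) -> Optional[str]:
    """Most probable category, or None for an unknown contra account."""
    per_account = counts.get(account)
    if not per_account:
        return None
    return per_account.most_common(1)[0][0]


# Toy data: one employer account pays both salary and expense claims.
counts = train([("123", "salary")] * 4 + [("123", "expense claim")])
print(probability(counts, "123", "salary"))  # 0.8
print(classify(counts, "123"))               # salary
print(classify(counts, "0"))                 # None
```

The toy data shows the salary versus expense claim problem: as long as the account number is the only feature, the more frequent category always wins.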

Implementing the probability for the description is similar to the contra account number: the number of transactions with word y in the description and category x in the training set, divided by the number of all transactions with word y in the description. This works only for a single word in the description. To combine multiple words, these probabilities should be multiplied. OK, implemented; time to validate again: 912 transactions are classified successfully, 208 can’t be classified and 121 are classified incorrectly. The incorrect classifications worry me. If I look into the details I see that most of them are classified as groceries, while the actual category can be many things, e.g. gifts. The problem here is that most of the incorrect transactions are “pin” transactions, as we call them in the Netherlands: transactions paid with my debit card. These transactions are categorized as groceries due to similarities in the description field. They all contain the same words.
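The word-based step can be sketched in the same way. This follows the description in the text (multiplying the per-word fractions) rather than a textbook naive Bayes with priors and smoothing. Splitting on whitespace, lowercasing, and ignoring words that are not in the training set are assumptions of this sketch, and all names and data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple


def train_words(transactions: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count, per word, how many transactions of each category contain it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for description, category in transactions:
        for word in set(description.lower().split()):
            counts[word][category] += 1
    return counts


def classify_description(counts: Dict[str, Counter],
                         description: str) -> Optional[str]:
    """Multiply the per-word fractions and pick the best category."""
    words = [w for w in set(description.lower().split()) if w in counts]
    if not words:
        return None  # nothing known about any word
    categories = set().union(*(counts[w].keys() for w in words))
    scores = {}
    for category in categories:
        score = 1.0
        for w in words:
            score *= counts[w][category] / sum(counts[w].values())
        scores[category] = score
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


word_counts = train_words([
    ("pin supermarket amsterdam", "groceries"),
    ("pin supermarket utrecht", "groceries"),
    ("pin bookshop amsterdam", "gifts"),
])
print(classify_description(word_counts, "pin toyshop amsterdam"))  # groceries
print(classify_description(word_counts, "pin bookshop"))           # gifts
```

The first call reproduces the "pin" problem in miniature: the only known words are shared by most debit card payments, so a purchase in an unknown shop ends up as groceries.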

At this moment the solution cannot replace the ruleset, mainly due to the high number of incorrect classifications. However, I still want to include other features. Adding the amount should fix some of the problems.

P.S. As proof that the old system is not ideal either: I found 1 transaction that was incorrectly classified in the old system!


© 2017 Mark van Venrooij
