Showing posts with label fts. Show all posts
Showing posts with label fts. Show all posts

This Week I Learned - 2016 Week 50

Last week post or the whole series.

Two more weeks to go before we close the chapter for the year 2016.

Am I a good programmer? Think again, it's not just you but the remaining 99% of developers as well. The key point here is stop comparing, but instead identify the gap, and work out a plan to reduce the gap. This is doable provided that you realize what HN user, socmag advised that "...program if you enjoy it. Everything else will come. Financial reward is a side effect, not a cause.". Otherwise, just do something else. YOLO. Maybe rephrase the question, ask yourself, am I having fun? Need more advice? Read through similar discussions in HN.

Discipline is sustain through repeating a specific behaviour. How can we develop this systematically? Pick and develop ONE habit. Start small and start seriously. Don't overwhelmed yourself. Do it on daily basis and no more zero day. Anything is better than do nothing. Walking is better than sitting at the couch. Running is better than both. While we at it, get consistent sufficient sleep and rest. Try meditation to enhance your awareness and mindfulness as well. Follow the cue/craving -> response -> reward cycle suggested by Charles Duhigg. Do this for 66 days instead of the conventional 21 days. Also, another good write-up on building habit. Good luck!

Learning modern C++? So many resources, unfortunately, mostly are outdated and some still use Turbo C++. It seems C++ also suffered the same fate as Perl, hence, Modern Perl books was written. Then where can we obtain up-to-date information? Arne Metz suggested a few sites to check out for a start.
Follow up on implementing Full-text searching (FTS) in MySQL in week 48. The SQL example given there only suitable for searching exact keyword. What if you want to search partial word matching instead ala SQL LIKE operator? This is possible using asterisk (*) and BOOLEAN MODE modifier. Under this mode, the asterisk (*) is a wildcard operator, as in '%' in SQL. Example as shown.
SELECT name, MATCH(name) AGAINST('bio*' IN BOOLEAN MODE) AS relevance
FROM subjects
WHERE MATCH(name) AGAINST('bio*' IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 10

Something interesting on Git encountered this week. As I was git pulling the source, the unpacking process was so slow and I killed it with Control+C. The issue by doing this is that you've a lot of dangling commits or blob.
$ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (540/540), done.
dangling blob f7c89a6a3fd135a16531bd776ecf04dcc9096cc1
dangling blob c66981a6cc4b877e1fe2064e6423c21831e308b3

To clear these dangling blob, you will need to clean up. Later I found out that unpacking was slow because there are a lot of binary files (graphic files) were added.
$ git gc --prune=all
Counting objects: 953, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (541/541), done.
Writing objects: 100% (953/953), done.
Total 953 (delta 392), reused 498 (delta 204)

Branching is cheap and free for SCM like Git. This is one concept I forgotten and during the last sprint, having two developers working on a single branch maybe not be a good idea unless both of you are can resolve conflicts correctly, do not rebase from master branch until testing, and squash your commit before merging as shown below.
$ git checkout featureX
$ git merge --squash featureX-dev1
$ git commit

Reflection on being a developer after 40. Most of the advices he gave were spot on especially choosing the galaxy (technology stack) wisely and open spaces office is crap. Yes, open collaboration my big foot, open noises would be the right description.

Learning Perl? Need to advance your Perl knowledge? Read Advanced Perl.

This Week I Learned - 2016 Week 48

Last week post or the whole series.

December. How fast the time flies as we're approaching the end of the year 2016. Four more weeks to go and we will embrace the new year 2017. Yet, there is so much more to do here and there.

Full-text search (FTS) support for InnoDB was added in MySQL since version 5.6. While is a welcoming feature, especially those who don't want to use third party search engine like Sphinx or Apache Solr, there are still some default behaviours that you'll need to be aware of. First, there is this minimum and maximum word length to be indexed. By default, the minimum word length is three. If you need to produce results with two word length, consider adjusting the server settings and restart it later. Next, you can enable stop words being indexed. Stop words are common words in the language likes "the", "is', or others. Both settings are discussed here in good details.

Looking for FTS full examples, do look into Gutenberg book searching or song searching implementation. For a quick example, below SQL query is good enough for you to get started using FTS through multiple tables.
SELECT *, 
    MATCH(books.title) AGAINST('$q') as tscore,
    MATCH(authors.authorName) AGAINST('$q') as ascore,
    MATCH(chapters.content) AGAINST('$q') as cscore
FROM books 
LEFT JOIN authors ON books.authorID = authors.authorID 
LEFT JOIN chapters ON books.bookID = chapters.bookID 
WHERE 
    MATCH(books.title) AGAINST('$q')
    OR MATCH(authors.authorName) AGAINST('$q')
    OR MATCH(chapters.content) AGAINST('$q')
ORDER BY (tscore + ascore + cscore) DESC

On SQL. Sometimes the solution was so simple that we have overlook even the basic default feature. If you want to find the unique and maximum row by each group which sorted overall, the direct approach is just use MAX aggregate function. Example as shown below. Another approach is to use user variables, not my preference though.
SELECT t.client_id, MAX(t.points) AS "max"
FROM sessions t
GROUP BY t.client_id 
ORDERY BY MAX(t.points) DESC

Want to retain the order of the SQL query in your `IN()` operator? Use MySQL `FIELD()` function. Example as shown.
SELECT * FROM table ORDER BY FIELD(ID,1,5,4,3);

Perl Advent Calendar 2016 have started. I've mixed feeling regarding the first day post before Christmas. Till today, we still don't have a graphing feature built-in to visualize class relationships for any IDE out there for any less supported languages. And yet, we still needs to rely on Graphviz to visualize it. While Graphviz is an excellent tool, it lacking one crucial feature, better automatic diagram layout, something similar to yWorks' yFiles library.

Which Git workflow should you use? Gitlab have a discussion on different workflows. For me, master branch is always deployable approach seems to work for me. This HN user describes it succintly. Key points are:

1. Master branch is the deployable branch
2. All new features and developed in feature or topic branches.
3. Continiously rebase from master. Read this on resolving rebasing conflicts.
4. Send pull request for reviewing.
5. Testing is done in feature branches after rebasing from master and signed off from reviewer.
6. Feature branch is merged to master with a `--no-ff` (no fast foward). See diagram below.


The Github Flow is have the similar approach but the feature branch is deployed first before merge back to the master branch. The advantage of this approach is that you can always rollback to the master branch. Another variation is the Git flow, more complex with additional develop branch. Trunk-based development is another approach but I used before (another variation). However, I don't like the complexity introduced. Suitable if you have a dedicated release engineer.

Looking back at C++ again and I've no idea what I'm looking at. Why use typedefs to create alias for the default basic types? Portability. Implementing callbacks is a bit tricky, you really need to get over the C++ syntax used. Delegation in C++? Seriously, so many ways? All looks very hackish to me. Where is ? Didn't realize C++ don't have a ISO standard for quite some times. Undefined references? Most likely the order of the files being compiled causing linker problem.

Overwhelmed by front-end development works? The front-end scene is a moving target right now. There even a study plan to cure all these Javascript fatigue.