Showing posts with label sql. Show all posts
Showing posts with label sql. Show all posts

This Week I Learned 2018 - Week 36

Old posts from archive or something from last week.

Why SGI's Onyx was such an attractive and desire visual battle station during the 90's? (via HN) Yes, the size was overwhelming and looks like a mini fridge (or an espresso machine) but its graphical processing was the best during that period. Looking back through their history, all SGI's machines were always powerful but damn expensive. Sadly they should have created a consumer version instead of focusing on high end market which leads to its slow demise.


Is Nikon Z6/Z7 worth it? No, pricing and features (compare to Sony equivalents) and limited lens available making it a less suitable choice. Unless you're professional or prosumer photographer who prefers Nikon's ergonomic body and wants to reuse all your existing Nikon lens collection (however, autofocus only works with AF-S or AF-P lens). Otherwise, for normal hobbyist, maybe it's wise to wait for budget or entry level models like Z3/Z4/Z5 or something from Canon to release it's own mirrorless system to join the competition. Maybe you should just stick with your mobile camera, you don't need the latest greatest camera anyway.

Do you have any good advice on writing? Write fast, throw away everything, and take the conclusion as introduction.
"My English teacher used to say “when you have to write an essay, write a first version. Then throw away everything but your conclusion and use that as the introduction of your real essay"
What is the one thing you should be aware when using React Router? Check the version of the installed package first. There were some differences on how the router rules work between version 4 onward and before version 3. I was stuck with a router problem for a while before I realized that I'm looking at legacy documentation.

How to sort NULL column at the end of the queried result in SQL? The simple answer (which I always forgot) is shown below. This is compatible with both PostgreSQL and MySQL.
ORDER BY date IS NULL, date DESC

Why we need to use generics in TypeScript instead of `any` type? To quote from the official documentation (emphasis added).
While using any is certainly generic in that it will cause the function to accept any and all types for the type of arg, we actually are losing the information about what that type was when the function returns. If we passed in a number, the only information we have is that any type could be returned.

Instead, we need a way of capturing the type of the argument in such a way that we can also use it to denote what is being returned. Here, we will use a type variable, a special kind of variable that works on types rather than values.
In short, we want to know the return type of a generic function.

This Week I Learned 2018 - Week 19

Last week post or something else instead.

Surprised and shocked. The only two words I can think off to describe my feeling for the past week. First, surprised because this one particular item on my bucket list which I guestimate will took me next 20-plus years or never to check off have been accomplished this week. Secondly, shocked it came so soon and such a massive shift and the non-stop flair of dramas. The staycation was well-needed as I was still recovering from my injury. Enough said, I've done my part and moved on to next item on my list.

Infestation of the Blue Green Algae (BGA) is getting worse and we have to do almost daily water changes on certain aquarium tanks. Prevention method is quite straightforward, use blackout method. First, 30% to 50% water changes, add air stone, cover tank for 3 to 4 days, and did another 30% to 50% water changes again.

Some MySQL stuff I've encountered recently.

When creating a MySQL stored function, do remember if the function does not modify data, you will need to set extra explicit options for binary logging.
ERROR 1418 (HY000): This function has none of DETERMINISTIC, NO SQL,
or READS SQL DATA in its declaration and binary logging is enabled
(you *might* want to use the less safe log_bin_trust_function_creators
variable)

The sample SQL statement as follows.
CREATE FUNCTION f1(i INT)
RETURNS INT
DETERMINISTIC
READS SQL DATA
BEGIN
  RETURN i;
END;

The limitation or restriction of subquery in MySQL. It doesn't supports `LIMIT` in subquery yet? Encountered this error message. To work around, you've to rewrite you SQL query using `JOIN` instead.
SQL Error (1235): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'

Can't delete using subquery? Wrap your subquery in additional subquery.
DELETE e.*
FROM tableE e
WHERE id IN (SELECT id
    FROM (SELECT id
        FROM tableE
        WHERE arg = 1 AND foo = 'bar'
    ) x
);


This Week I Learned 2018 - Week 17

Last week post or something from the past instead.

Humour aside, I've known two persons (a tech guy and a lecturer) who don't even have a mobile phone, let alone social media accounts. Yes, they are on the rather extreme cases but that was due to their own personal stands and life philosophy. While there have been quite a lot of minimalist life style movement these days but little emphasis on digital world. As usual, moderation is the key here. Too much of everything will cause unwanted disturbance and distraction to your personal and working life.


The Playoff Rondo is part stats and part myth although the stats don't tell the full story. But it's quite entertaining to watch him play, especially the playoff and with big centers.

Plan B before you turn 40. (via HN) Relevantly, experiences on those over 50 who have been laid-off from the tech-force seemed quite positive, life still goes on although on different path. Well there are differences different between Western and Asia countries, this is just part of your work life, especially for those working in tech. Practice minimalist life style, live below your means, exercise and keep healthy, and do what you love. You have to agree with, web development is so accessible these days. Anyone good basic knowledge of English and keen persistence, they can build up the system fairly quite fast. Furthermore, as you aged, priority in life changed as well as your health. Moreover, older tech guy tends to question more and call out BS.

Where SCRUM is not the right choice. (via Reddit) The discussion in the Reddit tells a lot of the horror or abuse stories where SCRUM was implemented literally. It does not work for software team with different project and don't share the SAME backlog. If you have different products, you can't implement SCRUM within the same backlog! Period.

Friendship’s Dark Side: ‘We Need a Common Enemy’. It's a common knowledge but right now it have been proven that "Xenophobia and in-group bias go hand-in-hand". Average fall-outs withing the social group is around 7-plus months. Interesting read.

One item checked off the bucket list. More items to go.
$ cpanm Webservice::Swapi
--> Working on Webservice::Swapi
Fetching http://www.cpan.org/authors/id/K/KI/KIANMENG/Webservice-Swapi-0.1.1.tar.gz ... OK
Configuring Webservice-Swapi-0.1.1 ... OK
Building and testing Webservice-Swapi-0.1.1 ... OK
Successfully installed Webservice-Swapi-0.1.1 (upgraded from 0.1.0)
1 distribution installed

Select all parent and child records. This is when having a Venn diagram really help in visualizing the details.
SELECT ...
FROM comments AS parent
LEFT JOIN comments AS child 
ON child.parent_id = parent.id
WHERE parent.parent_id IS NULL
ORDER BY parent.id, child.id;

This Week I Learned 2018 - Week 16

Last week post or something from the archive.

看了一套墨西哥的戏,《行走距离》。电影讲述一个三百公斤的胖子,寄托于摄影以解生活的苦闷。间接中,也结交了几位好友,让他生命有所改善。没有惯性的情节,简单不过及现实的故事。健康是宝,平淡是福。

Another week of learning new English vocabulary. One issue with making sentences using these words was that the sentences was either too forceful or too literally (直译的) translated.
Sandy alleviates (缓解) the excruciating pain on her lower body part by elevating both legs into a bucket with warm water. By doing so, they hope it can obliterate (忘却) the pain she was going through. Both legs were swelling due to her falling into liquescent (易液化的) soil and stabbed by tree trunk while trekking in the wood. The swelling have worsen and everyone worry that the muscle may liquefy (使液化) from within. 

Stuck with an interesting MySQL problem, which was quite easy but I can't seem to get it right after several tries. Basically we want to update certain column in t1 to the maximum value from a group of similar records. The SQL query was quite straightforward, just join with the second table, t2 with maximum record with the first table, t1.
UPDATE t_transaction tu
JOIN (
    SELECT code, MAX(flag) AS flag
    FROM t_transaction
    GROUP BY code
) t
ON tu.code = t.code
SET tu.flag = t.flag

Modern backend developer in 2018? (via HN) A checklist and decision making tree are always a good choice to make good decision making when picking up new technology stack.

Perl constant array. Sometimes the simplest and easiest way is the only way instead of going through and abusing quirky Perl syntax.
use constant FRUITS => ['apple', 'orange', 'banana'];

my $fruit = 'apple';
if (!grep $fruit eq $_, @{+FRUITS}) {
    print "Is not a fruit";
} else {
    print "Is a fruit";
}

use constant FRUITS => qw(apple orange banana); 
use constant FRUITS => (apple orange banana);  
my $fruit = 'apple';
if (!grep $fruit eq $_, FRUITS) {
    print "Is not a fruit";
} else {
    print "Is a fruit";
}

Modern PHP without a framework. (via HN) Nah, still hate it after all these years. Maybe I'm just prefer a more prettier programming language.

回顾自己的工作生涯,真的必须自我检讨未来的工作方向,看看那些是自己重视的
所以,你得先釐清有哪些條件是你重視的,像薪水、公司品牌、職位、工作內容等,一一把它寫出來。列出條件後,再排定先後順序,之後,和你手邊的 offer(拿到手的工作)比較,做成表格。縱軸是你重視的條件,並依照先後順序,由上而下排列;橫軸則是你拿到的 offer。當你完成表格後,該選擇的工作就會自動「跳出來」。
當我們在思考應該向哪家公司投遞履歷時,其實已經在篩選公司了,但如果是以「錢多事少離家近」為標準,長遠而言卻未必是最適合自己的公司。所謂「知己知彼,百戰百勝」,此時應該要先釐清自己的需求,包括期望的上下班時間、喜歡的工作環境、想做內勤或外務、理想的管理方式、工作如何分配、會議如何進行、同事如何相處、以及有哪些學習機會等。再依這些喜好排出先後順序,然後評估要應徵哪些公司。
工作上要留下那个”人才“? 当然是人品态度优先,能力可以慢慢去培养。

What are some tech companies that do not use open floor plan? Interesting question for those who are looking to justify cubicles. The office at Fog Creek was in between, open space for collaboration and private office for working.

Data class in Python 3.7. (via HN) What took them so long? Syntactic sugar or not, this should have been in any fundamental data type for any programming that supports OOP.

Vipassana for Hackers. (via HN) A book on how a tech guy perceived and practiced Vipassana meditation without involving religion, just the practice. Contrary to typical meditation book, lots of pictures explaining the core concepts.

Moving for good. (via HN) Good write-up when you're moving to a new place. Key quote here (emphasis added),
Ask lots of questions. Ask them to explain things, and show you how it’s done. When they state a fact, ask how they know. When they state an opinion, ask for examples.
How to be a systems thinker. (via HN) I still can't get my head to understand this and the list of reading materials is quite long. Worth spend time on this? We shall see then.

This Week I Learned - 2016 Week 50

Last week post or the whole series.

Two more weeks to go before we close the chapter for the year 2016.

Am I a good programmer? Think again, it's not just you but the remaining 99% of developers as well. The key point here is stop comparing, but instead identify the gap, and work out a plan to reduce the gap. This is doable provided that you realize what HN user, socmag advised that "...program if you enjoy it. Everything else will come. Financial reward is a side effect, not a cause.". Otherwise, just do something else. YOLO. Maybe rephrase the question, ask yourself, am I having fun? Need more advice? Read through similar discussions in HN.

Discipline is sustain through repeating a specific behaviour. How can we develop this systematically? Pick and develop ONE habit. Start small and start seriously. Don't overwhelmed yourself. Do it on daily basis and no more zero day. Anything is better than do nothing. Walking is better than sitting at the couch. Running is better than both. While we at it, get consistent sufficient sleep and rest. Try meditation to enhance your awareness and mindfulness as well. Follow the cue/craving -> response -> reward cycle suggested by Charles Duhigg. Do this for 66 days instead of the conventional 21 days. Also, another good write-up on building habit. Good luck!

Learning modern C++? So many resources, unfortunately, mostly are outdated and some still use Turbo C++. It seems C++ also suffered the same fate as Perl, hence, Modern Perl books was written. Then where can we obtain up-to-date information? Arne Metz suggested a few sites to check out for a start.
Follow up on implementing Full-text searching (FTS) in MySQL in week 48. The SQL example given there only suitable for searching exact keyword. What if you want to search partial word matching instead ala SQL LIKE operator? This is possible using asterisk (*) and BOOLEAN MODE modifier. Under this mode, the asterisk (*) is a wildcard operator, as in '%' in SQL. Example as shown.
SELECT name, MATCH(name) AGAINST('bio*' IN BOOLEAN MODE) AS relevance
FROM subjects
WHERE MATCH(name) AGAINST('bio*' IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 10

Something interesting on Git encountered this week. As I was git pulling the source, the unpacking process was so slow and I killed it with Control+C. The issue by doing this is that you've a lot of dangling commits or blob.
$ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (540/540), done.
dangling blob f7c89a6a3fd135a16531bd776ecf04dcc9096cc1
dangling blob c66981a6cc4b877e1fe2064e6423c21831e308b3

To clear these dangling blob, you will need to clean up. Later I found out that unpacking was slow because there are a lot of binary files (graphic files) were added.
$ git gc --prune=all
Counting objects: 953, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (541/541), done.
Writing objects: 100% (953/953), done.
Total 953 (delta 392), reused 498 (delta 204)

Branching is cheap and free for SCM like Git. This is one concept I forgotten and during the last sprint, having two developers working on a single branch maybe not be a good idea unless both of you are can resolve conflicts correctly, do not rebase from master branch until testing, and squash your commit before merging as shown below.
$ git checkout featureX
$ git merge --squash featureX-dev1
$ git commit

Reflection on being a developer after 40. Most of the advices he gave were spot on especially choosing the galaxy (technology stack) wisely and open spaces office is crap. Yes, open collaboration my big foot, open noises would be the right description.

Learning Perl? Need to advance your Perl knowledge? Read Advanced Perl.

This Week I Learned - 2016 Week 48

Last week post or the whole series.

December. How fast the time flies as we're approaching the end of the year 2016. Four more weeks to go and we will embrace the new year 2017. Yet, there is so much more to do here and there.

Full-text search (FTS) support for InnoDB was added in MySQL since version 5.6. While is a welcoming feature, especially those who don't want to use third party search engine like Sphinx or Apache Solr, there are still some default behaviours that you'll need to be aware of. First, there is this minimum and maximum word length to be indexed. By default, the minimum word length is three. If you need to produce results with two word length, consider adjusting the server settings and restart it later. Next, you can enable stop words being indexed. Stop words are common words in the language likes "the", "is', or others. Both settings are discussed here in good details.

Looking for FTS full examples, do look into Gutenberg book searching or song searching implementation. For a quick example, below SQL query is good enough for you to get started using FTS through multiple tables.
SELECT *, 
    MATCH(books.title) AGAINST('$q') as tscore,
    MATCH(authors.authorName) AGAINST('$q') as ascore,
    MATCH(chapters.content) AGAINST('$q') as cscore
FROM books 
LEFT JOIN authors ON books.authorID = authors.authorID 
LEFT JOIN chapters ON books.bookID = chapters.bookID 
WHERE 
    MATCH(books.title) AGAINST('$q')
    OR MATCH(authors.authorName) AGAINST('$q')
    OR MATCH(chapters.content) AGAINST('$q')
ORDER BY (tscore + ascore + cscore) DESC

On SQL. Sometimes the solution was so simple that we have overlook even the basic default feature. If you want to find the unique and maximum row by each group which sorted overall, the direct approach is just use MAX aggregate function. Example as shown below. Another approach is to use user variables, not my preference though.
SELECT t.client_id, MAX(t.points) AS "max"
FROM sessions t
GROUP BY t.client_id 
ORDERY BY MAX(t.points) DESC

Want to retain the order of the SQL query in your `IN()` operator? Use MySQL `FIELD()` function. Example as shown.
SELECT * FROM table ORDER BY FIELD(ID,1,5,4,3);

Perl Advent Calendar 2016 have started. I've mixed feeling regarding the first day post before Christmas. Till today, we still don't have a graphing feature built-in to visualize class relationships for any IDE out there for any less supported languages. And yet, we still needs to rely on Graphviz to visualize it. While Graphviz is an excellent tool, it lacking one crucial feature, better automatic diagram layout, something similar to yWorks' yFiles library.

Which Git workflow should you use? Gitlab have a discussion on different workflows. For me, master branch is always deployable approach seems to work for me. This HN user describes it succintly. Key points are:

1. Master branch is the deployable branch
2. All new features and developed in feature or topic branches.
3. Continiously rebase from master. Read this on resolving rebasing conflicts.
4. Send pull request for reviewing.
5. Testing is done in feature branches after rebasing from master and signed off from reviewer.
6. Feature branch is merged to master with a `--no-ff` (no fast foward). See diagram below.


The Github Flow is have the similar approach but the feature branch is deployed first before merge back to the master branch. The advantage of this approach is that you can always rollback to the master branch. Another variation is the Git flow, more complex with additional develop branch. Trunk-based development is another approach but I used before (another variation). However, I don't like the complexity introduced. Suitable if you have a dedicated release engineer.

Looking back at C++ again and I've no idea what I'm looking at. Why use typedefs to create alias for the default basic types? Portability. Implementing callbacks is a bit tricky, you really need to get over the C++ syntax used. Delegation in C++? Seriously, so many ways? All looks very hackish to me. Where is ? Didn't realize C++ don't have a ISO standard for quite some times. Undefined references? Most likely the order of the files being compiled causing linker problem.

Overwhelmed by front-end development works? The front-end scene is a moving target right now. There even a study plan to cure all these Javascript fatigue.

This Week I Learned - 2016 Week 43

Last week post or the whole series.

This is one of those week where I become very wordy. ;-)

Previous post was too long when I realized that Blogger don't let me add more than 20 tags. That threshold should be a good indicator for me to stop jotting down the item and move to next week post. Some of these notes below are old items which I haven't have time (lack of discipline) to jot it down in a timely manner (lack of consistency). Don't sweat on this. Reflect, learn, adapt, adjust, monitor, and continue as usual. Although there are some minor hiccups, there is still like 9 weeks to go for this year.

Generating and attached iCal invite through email? It may seems simple as first, but, as usual, it may works for one email client but not for others (we haven't take into consideration of desktop, mobile, and web client). Furthermore, there is iCal and vCal, both are VCALENDAR standard where the former is succesor of the later. Digging deeper, it seems that different email client behaves differently when come to REQUEST METHOD used. The typical troubleshooting procedure is first, make sure you can attach send the iCal correctly through email properly using the email library (using MIME::Lite in my case). Next, read other people code on implementing similar solution. You can may miss out certain API calls or use the library incorrectly.

Testing in Perl. You can go through these articles for a start before you jump in to Test Drive Deveopment (TDD) best practices. Some of the issues I've encountered. First, to compare two different different ordered arrays, `cmp_bag` in Test::Deep is your friend. Second, To simulate delay, instead of using `sleep`, you can use Test::Mocktime's `set_relative_time` to do so. But off course, you will need to learn more tricks about the default standard testing module in Perl, Test::More. Writing a new Perl's module, integrate testing as early as possible

While we on testing. To benchmark the speed of your SQL query, you can disable MySQL query cache. This ensure that no caching is used which may give the wrong impression regarding the speed of the query.

While this is quite old news, two features in Perl 5.10 which I think make a lot of senses. The `defined or` is useful to check if a variable is defined or else use the default value (more examples). And the `state` for declaring a variable as static variable. Useful when you want to implement localized caching.

On Perl again. Check if a substring exists in a string without using Regex through using `index`. I wish Perl have more Object-Oriented way of doing so. Maybe Perl 6 have it?

Note to self. This is very useful. Several alternative ways to monitoring event in Javascript, useful for debugging. My favourite is using the `monitorEvents($0)` method. So simple and straightforward.


While we're on Javascript. Developing a web application with heavy AJAX usages? Well, you have to use `history.pushState` somehow. Why not just switch to Turbolinks?

Storing hierarchical data in database? Adjacent List is the most common and first to use approach when modelling hierarchical tree. However, Bill Karwin's presentation and example and follow up example convinces me of using Closure Table or Bridge Table may be the balance and right approach. This is useful when you're using database system like MySQL that does not supports hierarchical and recursive SQL queries natively.


This Week I Learned - 2016 Week 32

Last week post or you might want to check out the whole series.

Slow week, nothing much being jot down for this week. In my process of becoming more minimalist, I've reduced a lot of "noises" in my life. This shall continue until I can reach the bare minimum necessity in my life, although this will be a life long ongoing journey.

Always use a higher-precedence operator like '||', '&&', or '^' in your return statement. Otherwise, the other expressions will be skipped. There is a Perl module, ProhibitReturnOr that does this checking.

Don't use OFFSET in your pagination SQL query but using keyset pagination. The basic concept is shown in the SQL query below. Never occurs to me to use WHERE instead of OFFSET. As I usually said, it's always nice to get more exposure with other DBMS instead of just MySQL. Further discussion via HN.
SELECT ...
  FROM ...
 WHERE ...
   AND id < ?last_seen_id
 ORDER BY id DESC
 FETCH FIRST 10 ROWS ONLY



Greatest N per Group SQL Problem

Few days back, I encountered a problem where I need to find the last update date for a pricing and I seemed to stuck at the SQL query. Initially I was thinking about using subquery, an old habit of using PostgreSQL for many years. But as you know, this going to be tricky as MySQL is always kind of dog slow at subquery.

Some googling lead me to this simple answer that I realized I had over-analyzed the problem. The question was so common that the Stack Overflow community created a special greatest-n-per-group tag for it.

Let's illustrate this with an example. Imagine you have a employee table with a schema as follows:
employee
- emp_id
- dept
- salary

How do you find the highest paid employee in each department ? The answer was so simple, just use MAX and GROUP BY. The answer in SQL as shown which is so generic and should works with any database system without any optimization.
SELECT emp_id, MAX(salary)
FROM employee
GROUP BY dept

Yes, that all. No complex subquery or joins. Duh ! What was I thinking ?

Find Top-Level Domain (TLD) With MySQL

I was stuck with a problem of extracting Top-Level Domain (TLD) from a column in MySQL database. TLD is the last part of the domain name, example is the .com in google.com. As TLD varies in size, you may have you have .io (popular with startup), .com, or .name, how to you extract this part out without using any code but only SQL? Why SQL? Because it’s easier to group domain by TLD using GROUP BY clause.

Luckily, MySQL’s SUBSTRING_INDEX function can solve this problem easily. As the function name implies, extract a subset of a string into an index. First, let’s look at the function definition and parameters.

SUBSTRING_INDEX(str, delim, count)
  • str, the string we want to perform the action
  • delim, the delimiter we want to break the string into an index
  • count, maximum number of item from left (if positive), or from right (if negative) return

Examples:
mysql > select substring_index('111.222.333.444.555', '.', '1');
-> 111

mysql > select substring_index('111.222.333.444.555', '.', '2');
-> 111.222

Note : Negative count return result from the right
mysql > select substring_index('111.222.333.444.555', '.', '-1');
-> 555

mysql > select substring_index('111.222.333.444.555', '.', '-1');
-> 444.555

Back to my original question, how do we extract different TLD from a domain name?
mysql > select substring_index('www.google.io', '.', '-1');
-> io

mysql > select substring_index('www.google.com', '.', '-1');
-> com

mysql > select substring_index('www.google.name', '.', '-1');
-> name

Unfortunately, this only work with first level TLD and not second or later level like www.yahoo.co.jp where the TLD is co.jp.

VARCHAR and TEXT in PostgreSQL

There is not performance between varchar and text datatype in PostgreSQL. The use of varchar and text are interchangeable. To quote the PG doc (bold emphasis by me.),
"Tip: There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs. In most situations text or character varying should be used instead."
Further discussion at Reddit.