Saturday, April 28, 2012

Make free calls and SMSs with Viber

I have just installed and tested Viber on my iPhone 3GS. Viber is an application for iPhone and Android phones that lets you make free phone calls and send text messages to anyone who also has the application installed.

Or, as the Viber developers describe it: You can call or text any Viber user, anywhere in the world, for free. Viber integrates seamlessly with your existing address book, has great sound quality and, once activated, does not require a PIN, username or any additional "in application" purchase. All Viber features are 100% free! 

Therefore, as long as you have Wi-Fi or a cheap data plan, you can start making literally free calls and sending free SMSs. Spread the word: the more Viber users there are, the more people you can contact for free.

Friday, April 27, 2012

Moving Master or Slave nodes in MySQL

In this post I want to discuss with you a concept which simplifies the failover or migration of a master or slave node in a MySQL replication setup without incurring any downtime to the application - i.e. a hot failover/migration.

But first I need you to read and understand these three axioms, which I collected and which I believe are the core of understanding MySQL high availability concepts:
  1. In a multi-master, multi-slave replication setup, a slave will only follow the coordinates of its single, adjacent master.
  2. In a master-master replication, transactions written to master 1 will first propagate to master 2 and then bounce back to master 1 - on this return trip, master 1 will recognize these as its own transactions and ignore them. 
  3. Place and record marker events on both the original and new master for the slave to be moved.
Ok, I made up these three axioms myself and they may seem confusing at first. Let me explain a bit:

In the first rule I am saying that if I have this setup: M1<->M2->S1, then S1 will need the coordinates of M2 to stay in sync. Quite obvious, no? Just make sure you never forget it and mistakenly use the coordinates of M1 instead. Also, for beginners, remember that a slave cannot have more than one master.

In the second rule I am stating another obvious thing, but one which few people put to practical use. Consider the following setup: M1->M2->S1. How would you go about converting it to M1<->M2->S1, i.e. making M1 the slave of M2, without incurring any downtime whatsoever on any of the nodes? This is pretty simple if you read the concept again - if at any point in time I read M2's master coordinates (SHOW MASTER STATUS on M2) and request M1 to become a slave of M2 using these coordinates, any transactions which have already propagated from M1 to M2 will be ignored when they come back to M1. I will write another article to illustrate this with a detailed example.
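As a minimal sketch of what this looks like in practice (the hostname, credentials and coordinates below are all made up for illustration):

-- On M2: read the current master coordinates.
SHOW MASTER STATUS;
-- Suppose this returns File: mysql-bin.000042, Position: 1337.

-- On M1: point it at M2 using those coordinates, then start replicating.
CHANGE MASTER TO
    MASTER_HOST='m2.example.com',
    MASTER_USER='repl',
    MASTER_PASSWORD='secret',
    MASTER_LOG_FILE='mysql-bin.000042',
    MASTER_LOG_POS=1337;
START SLAVE;

Any M1-originated transactions already in M2's binlog will loop back to M1 and simply be skipped, as per rule 2.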

The third rule is the concept I would like you to understand in this article - if you are able to understand it, then you can do almost all MySQL migrations without ever incurring any downtime on the setup. The problem when moving nodes is making sure they remain consistent - we need to know exactly when replication stopped, and at which point within the new master's log file we have to tell the slave to resume. This can easily be done by using marker events. A marker event is just a technique used in many replication setups (not just MySQL): you create a fictitious table on the master, take note of the coordinates within the bin log, and then ask the slave to stop when this table is created on its end. Then you can resume the slave from the previously recorded coordinates.

The best way to understand the 3rd axiom, Marker Events, is to look at this example:

[Diagram: on the left, the chained replication A -> B -> C; on the right, the star replication with B and C both slaves of A.]

Suppose we want to migrate the setup on the left hand side to the setup on the right hand side. This is a very common migration in the real world where we are requested to convert from a chained replication to a star replication setup (the inverse is also common). To do this, we need to follow these steps:

Step 1 - Stop Slaving C
On node C:
STOP SLAVE;

Step 2 - Create and Record Marker Events
On node A create a marker table and determine its coordinates from the bin log events:
CREATE TABLE mydb.marker1 (c1 SMALLINT);
SHOW BINLOG EVENTS;

Call these coordinates (fileA`, posA`).

On node B determine the coordinates from the bin log events:
SHOW BINLOG EVENTS;

Call these coordinates (fileB`, posB`).

Step 3 - Sync C to the Marker Event on B
On node C:
START SLAVE UNTIL
    MASTER_LOG_FILE=fileB`, 
    MASTER_LOG_POS=posB`;

Node C will stop exactly when the marker table is created. 

Step 4 - Point C to Marker Event on A and start slave
On node C:
CHANGE MASTER TO
    MASTER_HOST=hostA,
    MASTER_LOG_FILE=fileA`, 
    MASTER_LOG_POS=posA`;
START SLAVE;

Since we know that C stopped exactly at the marker, we can resume from that same event, which is recorded as (fileA`, posA`) on node A! As you can see, as long as we set up and record marker events on both the original and the new master, we are able to seamlessly move slaves around with no downtime required. As an exercise I recommend you try the inverse (from star to chain) using the same technique!

As a side note, I would like to personally thank René Cannaò for introducing me to these concepts.

Wednesday, April 25, 2012

How Tables are arranged in JOINS

I often get asked how an optimizer (be it Oracle's or MySQL's) decides which table will be the outer table in a join. Consider the following three tables: T1 (100 rows), T2 (10 rows) and T3 (50,000 rows).

If I have a query which joins T1->T2->T3, the optimizer will not necessarily join T1 with T2 and then join the result set with T3. Instead, it will choose the table with the lowest number of rows, in this case T2, and use it as the outermost table.

The join will then be T2->T1->T3 (note that the order T1->T3 or T3->T1 will not make a difference if they are related to each other as I will explain later on).

But why does the optimizer take the smallest table as the outer table? Very simple, if you think about it. In a join there will always be a full scan of the outermost table, because the query needs to look at all of its rows, and the result is used as a filter for the next table. The inner tables can easily be optimized with indexes on the joined columns; the full scan of the outermost table, however, is inevitable, so the optimizer picks the table with the fewest rows for that role.

In some cases the outermost table will still be very large. What happens then? In MySQL you can try to use a covering index that includes all the columns referenced in the query for the outermost table - this way MySQL will do a full index scan (type: index in the explain plan) rather than retrieving the data from the table.

What does this mean in practical query tuning? It means that before you start implementing indexes on every column of all the joined tables, you first need to check the explain plan and the join order chosen by the optimizer. Then start your analysis and implement indexes accordingly.
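As a quick sketch of that workflow (the join columns here are hypothetical - adapt them to your schema), you would first ask the optimizer for its plan:

-- The rows of the EXPLAIN output are listed in the join order chosen by the
-- optimizer; with the row counts above, expect T2 to appear first.
EXPLAIN
SELECT *
FROM T1
JOIN T2 ON T2.t1_id = T1.id
JOIN T3 ON T3.t1_id = T1.id;

Only once you see which tables end up on the inner side should you decide where indexes on the joined columns will pay off.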

Efficient Random Rows in MySQL

MySQL's official documentation suggests using ORDER BY RAND() (coupled with LIMIT if you want) to select one or more random rows from a table. This works for small tables, but as the number of rows increases you will notice that it becomes very inefficient due to the use of temporary tables (which are I/O bound) and filesorts.

I came up with two efficient solutions. One way is to use an external script or language to generate a random value and select that value from the id column (which is normally a clustered index) of the table - fairly easy, and I won't discuss it here unless asked. What I want to show you in this article is a pure-MySQL solution to efficiently retrieve a random row.

The idea is as follows:

  1. Generate a random number between 0 and 1
  2. Map this random number to a table id
  3. Select the row whose id was generated in the previous step
In SQL this would be:
SELECT @r:=RAND();
SELECT @i:=
 (SELECT MIN(id) FROM table1 WHERE id>=
  (SELECT (@r*(SELECT MAX(id) FROM table1))));
SELECT * FROM table1 WHERE id=@i;

Although it takes three statements to fetch the random row, this is very efficient and fast because every step can be resolved through the index on id. (Note that if the id column has large gaps, rows right after a gap are slightly more likely to be picked.)

Monday, April 23, 2012

Choosing the ideal InnoDB log file size

I refer to an incredible post by Baron Schwartz where he wittily explains a technique for choosing the best InnoDB log file size. The idea is to monitor the growth rate of the log sequence number for a minute and then adjust the size such that the logs can hold at most one hour of changes. Let me explain with an example:
mysql> pager grep sequence
PAGER set to 'grep sequence'
mysql> show engine innodb status\G select sleep(60); show engine innodb status\G
Log sequence number 84 1028230
1 row in set (0.06 sec)

1 row in set (1 min 0.00 sec)

Log sequence number 84 2028230
1 row in set (0.05 sec) 

The log sequence number is the total number of bytes written to the transaction log. Therefore the delta over our 60-second sample indicates the growth rate per minute:

mysql> select (2028230 - 1028230) / 1024 / 1024 as MB_per_min;
+------------+
| MB_per_min |
+------------+
| 0.95367432 |
+------------+

In an hour the growth will therefore be about 57MB, which we round up to 64MB. Since we have two log files by default, they can be 32MB each. Sounds small, but in reality that is the optimal size for this particular server. Baron mentions a pretty busy server which only requires two log files of 64MB each!
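If you then want to apply the calculated figure, the relevant my.cnf settings would look roughly like this (note that on the MySQL versions of this era you must shut the server down cleanly and move the old ib_logfile* files out of the way before restarting with a new size):

# my.cnf sketch - sizes taken from the example above
innodb_log_files_in_group = 2
innodb_log_file_size      = 32M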

Dubstep, MySQL and Beer

How many of you study or work with dubstep running in the background? Can't wait to finish this MySQL studying in a week or so, so that I can start working on producing some dubstep shit.

Show a hierarchy list of Linux processes

I just found a cool trick, thanks to a colleague of mine, David Saliba: you can cleverly use the Linux ps tool to display the process list in hierarchical order by using the fax flags (easy to remember!):
[root@db10 ~]# ps fax | less
12778 ?        S      0:00 vncconfig -iconic
12779 ?        S      2:19 xterm -geometry 80x24+10+10 -ls db10:1
12782 pts/2    Ss+    0:00  \_ -bash
15101 pts/2    S     14:10      \_ xterm
15103 pts/3    Ss     0:00      |   \_ bash
15121 pts/3    S+   153:10      |       \_ top
15123 pts/2    S      2:01      \_ xterm
15125 pts/4    Ss     0:00          \_ bash
15143 pts/4    S+     1:12              \_ tail -f alert_dbp10.log
12780 ?        S      0:00 twm
32399 ?        S      1:31 xterm
32401 pts/5    Ss+    0:00  \_ bash
Also, if you pipe the command to less you can scroll sideways to see the whole command string when it does not fit on the screen.

Measure Transactions per Second (TPS) in Oracle

In a previous post I described how I was tasked to produce a number of KPIs and queried Oracle's AWR to get the data. I was also asked to get the Transactions per Second that our Oracle cluster produces on a weekly basis - the requirement was to aggregate this information across all three nodes of the cluster rather than displaying the TPS per node. This was accomplished as follows:
WITH hist_snaps
     AS (SELECT instance_number,
                snap_id,
                round(begin_interval_time,'MI') datetime,
                  (  begin_interval_time
                   + 0
                   - LAG (
                        begin_interval_time + 0)
                     OVER (PARTITION BY dbid, instance_number
                           ORDER BY snap_id))
                * 86400      -- seconds per day
                   diff_time -- calculate difference in time from 
                             -- previous snapshot
                   FROM dba_hist_snapshot),
     hist_stats
     AS (SELECT dbid,
                instance_number,
                snap_id,
                stat_name,
                  VALUE
                - LAG (
                     VALUE)
                  OVER (PARTITION BY dbid,instance_number,stat_name
                        ORDER BY snap_id)
                  delta_value -- difference in value from previous 
                              -- snapshot for each stat_name
           FROM dba_hist_sysstat
          WHERE stat_name IN ('user commits', 'user rollbacks'))
  SELECT datetime,
         ROUND (SUM (delta_value) / 3600, 2) "Transactions/s"
     FROM hist_snaps sn, hist_stats st
   WHERE     st.instance_number = sn.instance_number
         AND st.snap_id = sn.snap_id
         AND diff_time IS NOT NULL
GROUP BY datetime
ORDER BY 1 desc;

You can then graph the values, as in this example showing the TPS trend over one week:

Measure the number of Active Sessions in Oracle

I was recently asked to produce some KPIs for our three-node Oracle cluster database. One of the KPIs was a graph trend of the total number of active sessions across all the cluster nodes, sampled every minute. This can be done by querying Oracle's Automatic Workload Repository (AWR):
     select
        round(sample_time, 'MI') as sample_minute,
        count(*) as active_sessions
     from
        dba_hist_active_sess_history
     where
        sample_time > sysdate - 7
     group by
        round(sample_time, 'MI');

You can then graph the values as in this example:

Sunday, April 22, 2012

Using Linux screen to be productive

Screen is a full-screen window manager that multiplexes a physical terminal between several processes, thus creating a number of virtual terminals. It is like having a number of monitors attached to the server, where you detach from and resume each session every time you connect. This is an invaluable tool for system administrators and a preferable way to resume a session compared to VNC (especially if you are only using the command-line interface).

I will be explaining how to use screen without too much theory and talk.

Starting a new screen session or reattach to existing session:
$ screen -R

List all the screen sessions:
$ screen -ls

Creating a new virtual terminal (from inside the screen session):
Ctrl-a, c

Detaching from the virtual terminal back to the CLI:
Ctrl-a, d

Hop to the next/previous virtual terminal:
Ctrl-a, n and Ctrl-a, p

Hop to a virtual terminal number:
Ctrl-a, 0-9

Kill all virtual terminals:
Ctrl-a, \

Password protect a screen session (lock):
Ctrl-a, x

Help on interactive screen commands:
Ctrl-a, ?

Example of usage, putting a couple of virtual terminals on a server: one for top, another for vmstat and another one for iostat:

Main Terminal:
james@earth:~$ screen -R


Virtual Terminal 0:
james@earth:~$ top
[ Press Ctrl-a, c to create a new virtual terminal ]


Virtual Terminal 1:
james@earth:~$ vmstat 5
[ Press Ctrl-a, c to create a new virtual terminal ]


Virtual Terminal 2:
james@earth:~$ iostat 5
[ Press Ctrl-a, c to create a new virtual terminal ]


Hop to the virtual terminal showing top:
[ Press Ctrl-a, 0 to go to terminal 0 ]


Detach from screen and return to main terminal window:
[ Press Ctrl-a, d ]


Display list of screen sessions:
james@earth:~$ screen -ls
There is a screen on:
 2055.pts-0.earth (04/22/2012 09:41:33 AM) (Detached)
1 Socket in /var/run/screen/S-james.

Delete Duplicates fast in MySQL

In an earlier article I explained how to delete duplicate rows in Oracle using the rowid pseudocolumn. MySQL has no such pseudocolumn, and to emulate rowid one has to use session variables, which can get a bit messy and slow when you have large tables.

As such one has to use some imagination and an old technique to remove duplicates. The process is as follows:

  1. Create a similar table to the original one which contains the duplicate values.
  2. Insert distinct values in the new table.
  3. Drop the old table.
  4. Rename the new table as the old one.
Pretty clear, no? In SQL we can implement the first two steps using the CREATE TABLE ... SELECT syntax. Therefore to remove duplicates from t1, you can execute the following code:
CREATE TABLE t2 
SELECT DISTINCT name, surname, country_id FROM t1;
DROP TABLE t1;
ALTER TABLE t2 RENAME t1;
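One caveat: CREATE TABLE ... SELECT copies neither the indexes nor the exact column attributes of t1. If you need those preserved, a variant sketch (assuming t1 contains exactly the three columns above) is:

CREATE TABLE t2 LIKE t1;          -- copies column definitions and indexes
INSERT INTO t2 SELECT DISTINCT * FROM t1;
DROP TABLE t1;
ALTER TABLE t2 RENAME t1;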

Friday, April 20, 2012

How Covering Indexes work in MySQL

I believe that MySQL has many cool and advanced index features which, if used properly, provide the best performance strategies compared to other RDBMSs. One such feature is called covering indexes.

The concept behind covering indexes revolves around the fact that in MySQL indexes do not contain just pointers to the data pages on disk but also the actual data! So an index with keys (id, name) would actually contain both columns (id, name) inside the B-Tree leaf nodes.

This means that if a regular index is used in a certain manner, we can get the data directly from the index - such index would be called a covering index. Data retrieval from a covering index is extremely fast!

For the purpose of this discussion I will be using the following table:
CREATE TABLE t1(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_name VARCHAR(20) NOT NULL,
  first_name VARCHAR(30) NOT NULL,
  last_name VARCHAR(30) NOT NULL,
  external_id INT UNSIGNED NOT NULL,
  country_id SMALLINT UNSIGNED NOT NULL,
  PRIMARY KEY(id)
) ENGINE=InnoDB;

An index, which MySQL auto-names external_id after its first column, is added as follows:

ALTER TABLE t1 ADD INDEX (external_id, first_name);

Now just suppose that we have the following query:
select first_name from t1 where external_id=1;

MySQL starts by looking at the filter key(s) and tries to find an index whose leftmost key is in this set. If it finds such an index, MySQL checks the next key in the filter set and tries to match it with the index's next key, and so on and so forth - I am highlighting this to emphasize that MySQL uses the index keys from left to right. I will explain this in more detail in a future article. Back to our example: external_id is in the leftmost position of the index, so this index will definitely be used. On top of that, MySQL realizes that the column to retrieve is also found in the index, so it doesn't need to grab it from the data pages. In other words we are lucky, because we are using the index as a covering index!

Let's check how this is explained in the plan's output:
mysql> explain select first_name from t1 where external_id=1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ref
possible_keys: external_id
          key: external_id
      key_len: 4
          ref: const
         rows: 1
        Extra: Using where; Using index
1 row in set (0.00 sec)
The Using index value in the Extra attribute means that MySQL is making use of a covering index. The (access) type attribute tells us that MySQL is using ref access to get the data from the covering index, i.e. a fast index lookup. The key external_id is the name of the covering index. In other words, we are not only retrieving data directly from the covering index, but we are also accessing it through a fast index lookup.

But what happens if the query now becomes the following:
select external_id from t1 where first_name='James';

Although both external_id and first_name are found in the index keys, we are not referencing the index correctly. First of all, MySQL tries to find an index whose leftmost key is first_name, and it will not find any. So rest assured that we will not see any fast index lookup with this query. However, MySQL smartly realizes that both columns used in this query are located in the covering index. It is therefore more efficient to read these values from the B-Tree leaf nodes (which are probably already in memory, and smaller than the actual table) rather than looking them up in the table. So in summary, here is what happens:
  1. MySQL looks at the data in the B-Tree (covering) index and not at the data pages of the table
  2. MySQL does a sequential read of the index structure rather than an index lookup
Although this is not as efficient as before, since the access is sequential, it is still better than a full table scan. This is depicted by the explain plan below:

mysql> explain select external_id from t1 where first_name='James'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: external_id
      key_len: 36
          ref: NULL
         rows: 1051
        Extra: Using where; Using index
1 row in set (0.00 sec)
The Using index value in the Extra attribute means that MySQL is making use of a covering index. The (access) type attribute value of index, which by the way is a bit of a misnomer, tells us that MySQL is doing a full scan of the index tree. The key external_id shows the name of the covering index. As already stated, this is usually better than a full table scan because the size of the index is usually smaller than that of the raw data.

Wednesday, April 18, 2012

Your odds of winning the Super 5

With all the commotion produced by my workmates about winning the notorious Maltese Super 5 lottery, I could not help but check out the odds of winning the EUR700,000 jackpot. These are the rules:
  • You must match all 5 drawn numbers to win
  • The lowest number you can choose is 1
  • The highest number you can choose is 42
  • A given number can only be chosen once per try



When you select your 1st number, you have 42 numbers to choose from, and...
    ...a 1 in 42 chance of picking the right one.

When you select your 2nd number, you have 41 numbers to choose from, and...
    ...a 1 in 41 chance of picking the right one.

When you select your 3rd number, you have 40 numbers to choose from, and...
    ...a 1 in 40 chance of picking the right one.

When you select your 4th number, you have 39 numbers to choose from, and...
    ...a 1 in 39 chance of picking the right one.

When you select your 5th number, you have 38 numbers to choose from, and...
    ...a 1 in 38 chance of picking the right one.


Multiplying these together gives 1 in 42 x 41 x 40 x 39 x 38 = 1 in 102,080,160 - but that is the chance of matching the numbers in the exact order drawn. Since the order does not matter, any of the 5! = 120 arrangements of your picks wins, so the odds become 102,080,160 / 120 = 850,668. That's it! You have a 1 in 850,668 chance of winning the Super 5.


Here are some other odds for the sake of comparison:
  • You have about a 1 in 2,000,000 chance of being struck by lightning.
  • A pregnant woman has a 1 in 705,000 chance of giving birth to quadruplets.
  • Someone eating an oyster has a 1 in 12,000 chance of finding a pearl inside of it.

Good luck! And sorry to say that the only way to increase your chances of winning is to buy more tickets! Choosing birthdays, your age, addresses, years, phone numbers, etc. does not help at all!

Monitor PostgreSQL Load Activity

The best and easiest way to monitor what a PostgreSQL server is doing is by using the pg_top tool. Usage is as follows:

$ pg_top -d postgres

301 Redirects on r00tb0x.com

In an SEO attempt to produce backlinks to jamesattard.com while not losing any ranking due to duplicate content, I am explicitly redirecting search engines which crawl r00tb0x.com to jamesattard.com. The redirect is permanent (301), which ensures that the search engines will drop the duplicates from their indexes.

For your information, the rewrite rule I am using is the following (it assumes mod_rewrite is already enabled with RewriteEngine On):

RewriteCond %{HTTP_HOST} r00tb0x.com [NC]
RewriteCond %{HTTP_USER_AGENT} .*(Googlebot|Bingbot).*
RewriteRule (.*) http://www.jamesattard.com [R=301]

Notice the two chained rewrite conditions, which restrict the rule to the Google and Bing bots. To test that the redirect triggers for bots but not for humans, I used curl as illustrated in a previous post:
ivory:~ user$ curl -A "Googlebot" r00tb0x.com
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved 
<a href="http://www.jamesattard.com">here</a>.</p>
</body></html>
ivory:~ user$ curl r00tb0x.com
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved 
<a href="http://www.r00tb0x.com/">here</a>.</p>
</body></html>

Test htaccess rewrites with curl

If you are doing massive SEO with 301 redirects and such, I recommend using curl to test your rules. I have been doing this for the past 3 years now and I am surprised I have never blogged about it. Curl lets you specify the user-agent, which is ideal for simulating a search engine crawl of your website if you have rules that redirect search engine bots but not humans.

This simple example illustrates how you can simulate a Google bot crawl against jamesattard.com:

curl -A "Googlebot" www.jamesattard.com

This will show how jamesattard.com looks to a Google bot.
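If you only care about the HTTP response code rather than the full page, curl's -I flag issues a HEAD request and prints just the headers, which makes a 301 immediately visible:

curl -I -A "Googlebot" www.jamesattard.com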

Tune Blogger for High Speed Performance

Guess what... Blogger/Blogspot blogs can load really slowly even if they are hosted by Google! This is not always Google's fault - the majority of the time it is the fault of your own layout, third-party widgets and JavaScript plugins :) I encountered such a problem in the past days - check the screenshot below, which I took from an online tool that measures a website's loading time:


It turns out that my blog was taking more than 15 seconds to load! I reran the test a couple of times, and sometimes it even spiked to 20 seconds! This will put off potential readers and subscribers or, as we say in SEO terms, spike the bounce rate to more than 80%. From the screenshot above it was evident that the problem was specific to my blog and not the Blogger network, as the server was spending 15 seconds just to display something smaller than 35kB.

Now some people might think that I display more than 10 posts/days of content on my page, or have big images which slow everything down - the reality is: no! I only display 5 posts, I don't have big images, and the only third-party JavaScript I use is SyntaxHighlighter. So what the heck is the problem?

After some investigation I did the following tuneups, and I achieved amazing results:
  • I put all the third-party JavaScript code (in my case just the SyntaxHighlighter) at the bottom of the blog (just above the </html> tag when you edit the blog's HTML)
  • I rearranged the widgets (you have to experiment) so that the potentially slow ones sit at the bottom of the sidebar.
And guess what? Now my page loads in just 1.70s and even less in most cases:


Also note that the page size now indicates that nearly the whole page loaded (651kB) - the order of the widgets made a difference, possibly because one of the widgets is buggy!

I hope you try these simple tricks and if you do, post your results in the comment section below.

Tuesday, April 17, 2012

Oracle export with query

If you need to produce an Oracle export with a filter/query, you may encounter syntax errors because you need to escape certain characters on the command line. This is the format I had to use:
exp james/pass tables=earth.my_table query=\'where date \> 
to_date\(\'25\-01\-2012\'\, \'dd\-mm\-yyyy\'\)\' file=export.dmp

An even simpler solution, which works on all platforms, is to put the query in a parfile:
[oracle@moon /]$ cat export.par
query="where date > to_date('25-01-2012', 'dd-mm-yyyy')"

You can then call export as follows:
exp james/pass tables=earth.my_table parfile=export.par file=export.dmp

Formatting Dates in SQL Server

In SQL Server, date formatting is a bit peculiar to say the least... I used the following query to get midnight 30 days before the current date (the cast-floor-cast dance strips the time portion):

select cast(floor(cast(getdate()-30 as float)) as datetime)
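An equivalent idiom you may find more readable is to do the day arithmetic against SQL Server's zero date (1900-01-01), which also strips the time portion:

select dateadd(day, datediff(day, 0, getdate()) - 30, 0)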

If you know the actual date strings, you can simply use them as follows:
select * from emp where dat between '2012-04-17' and '2012-04-18'

Monday, April 16, 2012

Hunter and bear's 2012 birthday party

Check out this interactive YouTube clip. One of the coolest ideas I have seen in a long time!


Remember, don't be a party pooper so don't end it :) Some cool years you can try: -2000, 0000, 2050, 1980, 1970, 1940

List user or table privileges in Oracle

This morning I was asked by the auditors to give them a list of privileges which have been granted to users USER_A and USER_B. You can do this by using the following SQL:
select GRANTEE, TABLE_NAME, PRIVILEGE from dba_tab_privs 
where lower(grantee) in ('user_a', 'user_b') order by grantee;

Migration from r00tb0x.com is complete

This weekend I finished migrating all the content from r00tb0x.com (Drupal-based) to jamesattard.com (Blogger-based). I used a mix of scripts and XML RSS hacks which turned out pretty well. Maybe one day I'll publish a little how-to.

Anyway, this means that you'll start seeing some IT-related articles on this blog!

My home workstation setup

This is my basic home computer setup - no biggie - a MacBook connected to an external flat screen.



Specs:
  • 2.4GHz Core 2 Duo
  • 2GB SDRAM
  • DVD R/W
  • 100GB SATA HD
  • 13.3" + 21" external screen

Uses:
  • Linux lab - 3x VirtualBox virtual machines
  • Music production - Traktor, VirtualDJ, GarageBand, Logic Pro

Saturday, April 14, 2012

r00tb0x.com for sale!

I have finally taken the decision to take r00tb0x.com down and migrate all its content to JamesAttard.com. Although r00tb0x.com averages 2000 unique visitors per month, the time and cost of maintaining it have outgrown my original intention of keeping a humble IT blog.


With that said, I will still write IT-related articles, but instead of putting them on a separate blog I will include them on this one. All the past r00tb0x.com articles will be moved here with their original timestamps. For your convenience I will label these posts r00tb0x.com so you can retrieve them from here.

If someone is interested in buying the r00tb0x.com domain, you can contact me by email.

Friday, April 13, 2012

Consolidating blogs?

The original idea behind opening r00tb0x.com was to put all geek and IT-related blog posts on a separate blog, and leave this webspace for more human-friendly words. This year, though, I seem to be heading towards a consolidation philosophy... This new lifestyle is compelling me to ditch r00tb0x.com and migrate everything back here, like in the old days.

What do you think, guys?

Thursday, April 12, 2012

Output table contents in PL/SQL stored procedure

With Oracle stored procedures the database developer can display output to the SQL client. This is very handy when you want a stored procedure to compute certain logic and display the output directly to the user.

Let's take a simple and common scenario where you want a stored procedure to output the result of a query which joins two tables. The basic steps are the following:

1. Store the resultset inside a cursor.
2. Enable dbms_output.
3. Loop through the cursor.
4. Use dbms_output to print each record of the cursor to the screen.

As you can see, the key output functionality comes through the DBMS_OUTPUT package. The following example procedure illustrates such usage:

CREATE OR REPLACE PROCEDURE get_category_id (
bc_text varchar2
) IS

CURSOR bc_cur 
   IS
        select b.betcategory_id b_id, l.langdataedit_text l_txt
        from betcategory b
        left join langdataedit l
        on b.betcategory_description = l.langdataedit_id
        where l.lang_id='eng' and lower(l.langdataedit_text) like '%' || lower(bc_text) || '%';

 BEGIN

 DBMS_OUTPUT.ENABLE (buffer_size => NULL);

 FOR bc_rec IN bc_cur LOOP
      DBMS_OUTPUT.PUT_LINE 
         (bc_rec.b_id ||' --> '|| bc_rec.l_txt);
   END LOOP;

END get_category_id;
/
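To actually see the output, remember to enable server output on the client side too. A sample SQL*Plus invocation (the search string 'tennis' is just a made-up example):

SET SERVEROUTPUT ON
EXEC get_category_id('tennis');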

Tuesday, April 10, 2012

MySQL Index Prefix for better performance

In MySQL you can optimize a query running against a table by choosing the correct index. Furthermore, MySQL allows you to index string columns (text, varchar, binary, etc.) efficiently by indexing only part of the field length, known as a prefix.

Suppose we have the following table, which is being queried by a varchar(50) field called name:

mysql> select * from users where name like 'J%';
+------+-----------------+---------+
| id   | name            | country |
+------+-----------------+---------+
|    1 | James Attard    |     100 |
|    2 | James Azzopardi |     100 |
|    3 | Jameson Gomez   |     200 |
+------+-----------------+---------+
3 rows in set (0.00 sec)

If I add a regular, non-prefixed index such as the following, the key size will be equal to the full column length (50 characters):

ALTER TABLE users ADD INDEX (name);

However, if you look closely at the resultset of the query you will notice that the values become distinct within the first 8 characters: "James A%". Thus if we choose a prefix length of 10 (to be on the safe side) we improve performance even more, as the key will be much shorter than the full column length:

ALTER TABLE users ADD INDEX (name(10));
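Before settling on a prefix length, you can measure how selective the prefix actually is; the closer its figure is to that of the full column, the less you lose by prefixing. A sketch for the 10-character prefix above:

-- Compare the selectivity of the 10-character prefix with the full column.
SELECT COUNT(DISTINCT LEFT(name, 10)) / COUNT(*) AS prefix10_selectivity,
       COUNT(DISTINCT name) / COUNT(*) AS full_selectivity
FROM users;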

Sunday, April 08, 2012

A phase without wheels

I would like to document my current status of having none of my three vehicles in working condition :) My car is undergoing clutch master-cylinder surgery, one motorcycle is in the middle of a streetfighter conversion project, and the other motorcycle has an apparent electrical problem.

We normally have backups, and backups of backups... you know what? If the universe wants you without backups, then all backups will fail!!!

Wednesday, April 04, 2012

Using vmstat to detect memory problems in Linux

Troubleshooting and detecting memory-related problems in Linux normally boils down to analysing, and eventually determining, what is going on within the virtual memory. Linux comes with a handy tool called vmstat which helps us examine it.

Analysing the virtual memory is important in any memory-related problem because it works hand in hand with the physical memory (RAM). Since physical memory is finite, the kernel moves processes, or parts of them (called pages), from RAM to disk - the location on disk is called swap space, and the activity of transferring pages or complete processes is called paging or swapping.

While we are talking about definitions, let's learn two important ones:
1) Swap-in/page-in: the kernel brings a process, or a page of a process, from swap back into RAM.
2) Swap-out/page-out: the kernel evicts a process, or a page of it, from RAM out to swap.

For this reason, a system doing lots of constant swap-outs may indicate memory problems, because the kernel continuously needs to remove processes from memory in order to make space for new ones.

vmstat is normally used with a single parameter which determines the sample rate, the most common being 5 seconds:

vmstat 5

The most important columns are highlighted in the screenshot below - free memory, swap-in (si) and swap-out (so).

If you see a constantly high rate of "so" and low "free" memory, you might need to pinpoint the underlying application and consider upgrading the physical memory or shifting the application to a dedicated server.
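Once vmstat confirms the pressure, one way to pinpoint which processes occupy the swap (on kernels recent enough to expose the VmSwap field, 2.6.34+) is:

# Top swap consumers; each line shows /proc/<pid>/status and the kB swapped out.
grep VmSwap /proc/[0-9]*/status | sort -k2 -nr | head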

Tuesday, April 03, 2012

Using Heartbeat tables to monitor Database replication Part 1

Using heartbeat tables is a very widely used technique for an application to determine whether two or more database nodes are in sync. Part 1 of this series describes the basic scenario of using a heartbeat table in a simple master-slave replication.

In a simple master->slave replication, an application would normally run reports on the slave node. How can the application ensure that the slave is up to date? One method is to create a simple table on the master database called HeartBeat (as an example) with one TIMESTAMP/DATE field. A job (database job/server job) runs every 10 minutes and updates this field with the current timestamp. This table is also replicated to the slave, so both nodes have a HeartBeat table. This is shown in the figure below:

The application reads the timestamp field from the heartbeat table and compares it to the current time. In this way the application can detect any lag or replication error. The figure below demonstrates such a scenario:

When the master stops replicating, the HeartBeat table on the master keeps updating itself but the changes are no longer reflected on the slave. The application reads the date from the slave's HeartBeat table, detects a lag from the current timestamp and alerts the user.
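A minimal sketch of the idea in MySQL (the table, event and column names are all made up; the scheduled update could just as well be a cron job):

-- On the master: the heartbeat table and a job touching it every 10 minutes.
-- The event scheduler must be enabled (event_scheduler=ON).
CREATE TABLE HeartBeat (ts TIMESTAMP NOT NULL);
INSERT INTO HeartBeat VALUES (NOW());
CREATE EVENT update_heartbeat
    ON SCHEDULE EVERY 10 MINUTE
    DO UPDATE HeartBeat SET ts = NOW();

-- On the slave: the application checks how stale the replicated timestamp is.
SELECT TIMESTAMPDIFF(MINUTE, ts, NOW()) AS lag_minutes FROM HeartBeat;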