Posts

Showing posts from 2013

Use 16GB SSD for Swap on Amazon Linux c3.large Instance

Amazon announced the new generation C3 instance types, compute-optimized instances available in five sizes: c3.large, c3.xlarge, c3.2xlarge, c3.4xlarge and c3.8xlarge, with 2, 4, 8, 16 and 32 vCPUs respectively. C3 instances provide the highest-performance processors and the lowest price per unit of compute of all Amazon EC2 instances. They also feature Enhanced Networking and SSD-based instance storage. On C3 instances, each vCPU is a hardware hyperthread of a 2.8 GHz Intel Xeon E5-2680v2 (Ivy Bridge) processor.

Setting up a c3.large instance with SSDs

When setting up the instance, make sure you add both instance store volumes. It's up to you how you set up your root storage; for this example I opted for 16GB with 480 provisioned IOPS (you need to maintain a 30:1 IOPS-to-GB ratio).

Testing the SSD speed

Once the server is set up and launched, we can use the second 16GB SSD (mounted on /dev/sdc) for swap on the new Amazon Linux c3.large instance. # Test SS…
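The excerpt cuts off before the commands; here is a minimal sketch of turning that second SSD into swap, assuming it appears as /dev/sdc as stated above (on Amazon Linux the device may also show up as /dev/xvdc):

sudo mkswap /dev/sdc   # write a swap signature to the raw device
sudo swapon /dev/sdc   # enable the swap space immediately
swapon -s              # verify the new swap space is active

Remember that instance store contents do not survive a stop/start cycle, so re-run these commands at boot (for example from /etc/rc.local) rather than relying on /etc/fstab alone.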

HowTo: MongoDB Tailable Cursors in Node.js

MongoDB capped collections are great for storing volatile information, like access and debug logs. To monitor a regular log file, you would probably type something like this on the command line: tail -f /var/log/access_log. The cool thing is that you can do the same with MongoDB capped collections: Capped collections are fixed-size collections that support high-throughput operations that insert, retrieve, and delete documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection. Creating a capped collection is easy; let's create a capped collection named log with a size of 16MB (the maximum size of a BSON object, if you did not know):

mongo localhost/test
> db.createCollection( "log", { capped: true, size: 16777216 } )

Capped collections act like any other collection, but they work in first-in-first-ou…
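The post's Node.js code is cut off in this excerpt, but the tail -f behaviour can be sketched right in the mongo shell with a tailable cursor (a minimal illustration of the technique, not the post's original Node.js code):

mongo localhost/test
> var cursor = db.log.find().addOption(DBQuery.Option.tailable).addOption(DBQuery.Option.awaitData);
> while (cursor.hasNext()) { printjson(cursor.next()); }   // with awaitData, hasNext() blocks briefly waiting for new documents

In Node.js the idea is the same: open a cursor on the capped collection with the tailable and awaitData options set and keep reading from it as new documents arrive.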

Elastic Load Balancing Supports Cross-Zone Load Balancing

Amazon Web Services announced cross-zone load balancing, which is supposed to further improve the distribution of requests among all the instances behind a load balancer. If you enable cross-zone load balancing, you no longer have to worry that clients caching DNS information will distribute requests unevenly, and the result is a more even load. I went ahead and tried it out myself. Enabling cross-zone load balancing is done through the command line, so you need to download and configure the tool on your computer before proceeding. Read the instructions on how to download, install and configure the ELB command line tools here.

List all the load balancers:
elb-describe-lbs

Enable cross-zone load balancing:
elb-modify-lb-attributes loadbalancer-id --crosszoneloadbalancing "enabled=true"

Verify:
elb-describe-lb-attributes loadbalancer-id --headers
CROSS_ZONE_LOADBALANCING  CROSS_ZONE_LOADBALANCING_ATTRIBUTE_VALUE
CROSS_ZONE_LOADBALANCING  true

Now your…

MongoDB Certified DBA

MongoDB University Offers Comprehensive Exams Worldwide. NEW YORK, NY, Nov 6, 2013. I am very glad to announce that I have earned certification as a MongoDB Certified DBA, Associate Level. About a month ago I was invited to try the MongoDB Certified DBA exam as part of an invitation-only pilot project. Before that, I had successfully passed both the M101 and M102P courses (almost a year ago). Invitations were sent to a selected group, and only the first 150 people were given a chance to take the exam. It was thrilling to receive a congratulatory email in my inbox today saying that I had passed this quite difficult exam. My certificate number is 100-000-011, which makes me the 11th person in the world to earn it (links to verify will be available a little bit later). According to The 451 Group's analysis of LinkedIn® member profiles, MongoDB is the most popular NoSQL database and now accounts for 49 percent of all mentions of NoSQL technologies. The forma…

AWS OpsWorks - New Resources Tab

AWS OpsWorks quietly updated its back end today and introduced a new tab called Resources. Interestingly, I discovered it because I had to update the IP address of one of our client servers during a migration from non-OpsWorks-managed servers to a new, OpsWorks-managed server. And surprise: there was a new tab! What can you do within the Resources tab?

Register available volumes with an existing stack
Assign an Elastic IP to a different instance (even in a different stack)

Registering volumes. You can find an available volume and register it with a stack. Then you can assign the volume to a stopped instance, as seen in the screenshot. You can change the name and mount point if needed.

Reassigning Elastic IPs. In my case, I had a server running with an Elastic IP assigned to it, and I did not want to go in and change the A record, because it would take too much time and a DNS update that can take up to 72 hours to propagate around the globe would probably upset the client. Therefore, I decided to point the Elastic IP to a n…

OpsWorks: Deploying your PHP Application Without Restarting Apache

For quite a while now I have been trying to use AWS OpsWorks to deploy software updates to our MetaSearch Gateway with less manual work. Before using OpsWorks, I built AWS servers from a custom AMI and deployed the app using git push to a remote; a remote deployment hook pulled the latest code into a folder, and any secondary servers were updated with rsync. Now, AWS OpsWorks is a great set of tools based on the famous Chef (currently version 11.4 is supported). With little effort I was able to include and configure the necessary PHP extensions for a default PHP web app: APC, memcached, Mongo, GeoIP. Since AWS also supports Elastic Load Balancing, I have two 24/7 instances running behind an ELB and a third (load-based) server waiting for load to increase. Deployments are quite easy: with the push of a button, code is pulled from the remote git repository and symlinks are updated. Sounds good? Well, too good to be true. By default, after deploying your app, Apache restarts itself, and naturally this c…

MongoDB Backup Strategies with Write-Heavy Applications

The MongoDB Management Service Backup, announced earlier in May, provides excellent backup and restore capabilities for your MongoDB replica set cluster. Setting up MongoDB backup is quite easy:

Sign up for the service and provide credit card details
Install and configure the agent

In about 10 minutes your replica set is backed up and the backup agent starts transferring the oplog data to the backup mothership. The price for snapshot creation and storage is negligible compared to the price ($2/GB) for oplog processing. For example, a previous invoice was $0.89 for 330GB of snapshot storage, $0.65 for 71GB of snapshot creation, and $155 for 77GB of oplog traffic. After analyzing my application, I figured out that most of the traffic was caused by two collections: metalog, a 1GB capped collection which stores every API request with details, and opcounters, which stores various operation counters; every minute a new document is created and the $inc operator increases various counters by one, depending on the action, e.g. db.o…
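The excerpt cuts off at the example, so here is a hypothetical sketch of the per-minute $inc counter pattern described above (the opcounters collection name is from the post; the field names and exact query shape are my illustration):

mongo localhost/test
> db.opcounters.update(
      { minute: ISODate("2013-11-06T10:15:00Z") },   // one document per minute
      { $inc: { search: 1 } },                       // bump the counter for this action
      { upsert: true }                               // create the document on the first hit in that minute
  )

Every such update is written to the oplog, which is why a busy counter collection like this can dominate the backup service's oplog traffic.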

Make Your phpUnit Tests Run Faster by Sharing Fixtures

There are a few good reasons why you may want to share fixtures between tests, and most of the time the underlying reason is bad design. For example, you might want to share database connections, but your database adapter does not implement the Singleton pattern. To take advantage of shared fixtures between tests within a single test case, your test case should implement the two public static methods setUpBeforeClass() and tearDownAfterClass(), and the shared fixture itself should be a protected static variable. The following example shows sharing a database fixture between tests:

<?php
class DatabaseTest extends PHPUnit_Framework_TestCase
{
    protected static $dbh;

    public static function setUpBeforeClass()
    {
        self::$dbh = new PDO('sqlite::memory:');
    }

    public static function tearDownAfterClass()
    {
        self::$dbh = NULL;
    }

    public function testShouldReturnCountGreaterThanZero()
    {
        $cmd = self::$dbh->prepare('SELEC…

10gen Announces Company Name Change to MongoDB, Inc.

New York—August 27, 2013—10gen, the MongoDB company, today announced it is changing its name to MongoDB, Inc. The new name more closely unifies the open-source database project with the company behind it. The change is effective immediately. “In 2007, 10gen began work on an open-source cloud computing stack. That was the birth of MongoDB, as the data layer of that platform,” said Dwight Merriman, Chairman and Co-founder at 10gen. “When we saw the potential for the database we had built we decided to focus 100% on MongoDB. Thus the company name 10gen and the database name MongoDB were different. With this change, our goal is to get the names back into alignment.” The MongoDB project, and its mongodb.org community website, are unaffected by this change. 10gen will change its corporate website from 10gen.com to mongodb.com. As part of today’s announcement, 10gen Education, which provides free, online training as well as public and private in-person courses, has been rebranded MongoD…

Stubbing and Mocking Static Methods with PHPUnit

PHPUnit has the ability to stub and mock static methods. Consider the class Foo:

<?php
class Foo
{
    public static function doSomething()
    {
        return static::helper();
    }

    public static function helper()
    {
        return 'foo';
    }
}
?>

To test the static helper() function with PHPUnit, you can write your test like this:

<?php
class FooTest extends PHPUnit_Framework_TestCase
{
    public function testDoSomething()
    {
        $class = $this->getMockClass(
            'Foo',          /* name of class to mock */
            array('helper') /* list of methods to mock */
        );

        $class::staticExpects($this->any())
              ->method('helper')
              ->will($this->returnValue('bar'));

        $this->assertEquals('bar', $class::doSomething());
    }
}
?>

The new staticExpects() method works similarly to the non-static expects…

Finding project files containing console.log()

I found this little snippet online and modified it for my own purposes. The goal was to find all template and PHP files containing console.log() calls left in by other developers and programmers for debugging purposes. You don't want these files in your production environment, so before committing your code you could (with small modifications) add this to your pre-commit git hook and refuse the commit if certain files contain console.log():

egrep -lir --include=*.{php,phtml,tpl} "(console\.log\()" .

This command finds all PHP, PHTML and TPL files in the current folder which contain the string console.log( and simply lists the files. You can pipe it to " | wc -l" to get a count.
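As a minimal sketch of the hook idea mentioned above, save something like this as .git/hooks/pre-commit and make it executable (note the brace expansion requires bash, and a stricter hook would scan only the staged files rather than the whole tree):

#!/bin/bash
# Reject the commit if any PHP, PHTML or TPL file still contains console.log(
if egrep -lir --include=*.{php,phtml,tpl} "(console\.log\()" . ; then
    echo "Commit rejected: remove console.log() from the files listed above."
    exit 1
fi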

View Manual Webspam Actions in Webmaster Tools

From the Official Google Webmaster Central Blog: View manual webspam actions in Webmaster Tools (Webmaster level: All): We strive to keep spam out of our users’ search results. This includes both improving our webspam algorithms and taking manual action for violations of Google's quality guidelines. In case you haven't noticed, there is a new link under Search Traffic called Manual Actions. If you click on it and find the message "No manual webspam actions found.", then congratulations, you have done everything right. A recent analysis of the Google index showed that well under 2% of domains are manually removed for webspam. If you are one of the unlucky few who do have a manual spam action, you probably have a message in your inbox (and Google keeps sending these until you take appropriate action). Here’s what it would look like if Google had taken manual action on a specific section of a site for "User-generated spam": According to Google, once you’ve…

MongoDB: Remove an Arbiter From a Replica Set

Removing an arbiter from a MongoDB replica set can be quite tricky, and the official MongoDB documentation does not directly explain how to do it. I assume that you have auth enabled (just like I have), so let's start by connecting to the primary of your replica set "repl" as an administrator:

mongo mongo.domain.com/admin -u admin -p pass

You will see the following prompt:

repl:PRIMARY>

Type in conf = rs.conf():

repl:PRIMARY> conf = rs.conf()
{
    "_id" : "repl",
    "version" : 8,
    "members" : [
        { "_id" : 0, "host" : "mongo.server.com:27017", "priority" : 2 },
        { "_id" : 1, "host" : "mongo1.server.com:27017" },
        { "_id" : 2, "host" : "arbiter.server.com:27017", "arbiterOnly" : true },
        { "_id" : 3, "host" : "mongo2.server.com:27017" }
    ]
}

Now let's adjus…
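The excerpt stops here; as a hedged sketch, the usual next step is to drop the arbiterOnly member from the configuration shown above and reconfigure (index 2 in the members array matches the arbiter in this output; the primary may briefly step down during reconfiguration):

repl:PRIMARY> conf.members.splice(2, 1)   // remove the arbiter entry from the members array
repl:PRIMARY> rs.reconfig(conf)           // apply the new configuration

Alternatively, rs.remove("arbiter.server.com:27017") achieves the same in a single documented call.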

HTTP DNT (Do-Not-Track) Demystified

There has been a lot of buzz around the new DNT (Do-Not-Track) privacy preference. The Do Not Track (DNT) header is a proposed HTTP header field that requests that a web application disable its tracking of an individual user. The feature is currently being standardized by the W3C and will function similarly to the Do Not Call registry. Today there is no clear definition of what it means to be "tracked" (according to the IETF draft: tracking includes collection, retention, and use of all data related to the request and response), advertisers aren't legally bound to comply with Do Not Track requests, and it still remains up to application developers to implement it. Let's assume you want to be one of the app developers who foresees that DNT will be mandatory in the next 12 months and would like to incorporate it into your app. How would you detect whether a user has enabled it? Depending on the browser and the nature of your app, you have two options:

Use HTTP headers
Use…
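To see what the header actually looks like on the wire, you can send it yourself (a quick illustration of my own; httpbin.org is a public request-echo service, not something from the original post):

curl -H "DNT: 1" http://httpbin.org/headers    # the echoed JSON response includes the DNT header you sent

Server-side detection then boils down to checking the request for that header; on the client side, JavaScript can inspect navigator.doNotTrack.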

Google Trends - Visualized + Top Charts

Google Trends was updated recently with one cool (yet useless) and one good-looking feature: hot trends visualized as they appear, and new monthly Top Charts and trending topics displayed in color visualizations. If you ask me, the UI and animation look too similar to the Windows 8 UI. It's cool to watch full screen, but in my opinion it's useless and provides no real value beyond trying to be cool. Google Trends - Hot Searches looks a little too similar to the Windows 8 UI. Click here to open the Useless Visualized Google Search Trends. Google Top Charts: now that's something I would like to see more of from Google. They have vast amounts of data being processed through their systems; I would not even call it big data, gig(antic) data or pet(abyte) data would be more appropriate. What they have done is put together monthly top charts for everything that's happening, trending or interesting, and done it in an absolutely nice way: Google Top Charts Categories…

Amazon Route53 Health Checks Available in CloudWatch

Website availability is of utmost importance if your business depends on it. There are lots of free services (and paid services, too) that constantly monitor your site's availability and send you an email if the website is not available. Some of these services offer periodic checks every 15 minutes, some every 5, and only a few support checking your site's health every 60 seconds or less. While this sounds good, what can you actually do when your site is down? Imagine a scenario like this: it's 8pm on a Friday, nobody is in the office, and tech support does not pick up the phone. And your website is down, totally. Yes, you received an alert, even multiple alerts. Some clients even call, and your competitors rub their hands. Here's what you can do: enable Route53 health checks and host a backup site in Amazon S3 (the backup site can simply say "Sorry, we are currently experiencing technical issues; please check back later." - it is 100 times better than…

The First Web Server In America

Did you know that the first web server in America was installed 22 years ago? That was almost a year after the world's first web server was launched at CERN in Switzerland. It contained a database of 300,000 research papers. "Today, if you don't have access to the web, you're considered disadvantaged," says physicist Paul Kunz, who on Dec. 12, 1991, installed the first web server in America on an IBM mainframe computer at the Stanford Linear Accelerator Center (SLAC). Here's how the SLAC default home page looked (it was 1991; nobody knew HTML5 back then). Image credit: SLAC. In a way, you could say that Paul Kunz built the world's first search engine, for 300,000 high energy physics publications in a database hosted at SLAC. It was pretty complicated to use: first of all, you needed to have a mainframe account, and second, there was the language the database engine on the mainframe used. Later on they added an email interface, so people could query by email and g…

Amazon RDS Switch to MySQL 5.5

Upgrading your Amazon RDS instance from MySQL version 5.1 to MySQL 5.5 has never been easier, as Amazon Web Services announced earlier this week. You can now modify your MySQL RDS instance using a feature called Major Version Upgrade. MySQL 5.5 includes several features and performance benefits over MySQL 5.1 that may be of interest to you, including enhanced multi-core scaling, better use of I/O capacity, and enhanced monitoring by means of the performance schema. MySQL 5.5 defaults to version 1.1 of the InnoDB Plugin, which improves on version 1.0 (the default for MySQL 5.1) by adding faster recovery, multiple buffer pool instances, and asynchronous I/O. Here's how you can upgrade your RDS instance:

Log in to your AWS Console
Go to RDS and select your instance
From the Instance Actions drop-down, choose Modify
Select the latest version of MySQL 5.5 from the drop-down
To upgrade immediately, select the Apply Immediately check box. To delay the upgrade to the next maintenance wi…

Setting Up Tax Rates in Magento Go via Import/Export

Setting up proper tax rates in Magento Go can be pretty frustrating, especially if you need to set up different tax percentages per zip code (or zip code range). For some reason, Magento Go does not come with default tax settings, and they don't maintain the list. Luckily, Magento Go has a feature to export tax settings, edit the rates in Excel, and then import them back. Sounds great?

Export existing Magento Go tax rates. To export your tax file, go to Sales → Tax → Manage Tax Zones & Rates. Open the file in the spreadsheet editor of your choice, add more tax rules, and export the file back to CSV. When you export CSV from Office, by default the contents look like a normal CSV:

Code,Country,State,Zip/Post Code,Rate,Zip/Post is Range,Range From,Range To,default
US-CA-*-Rate 1,US,CA,*,8.5,,,,
US-NY-*-Rate 1,US,NY,*,8.375,,,,

However, there is a catch: if you try to import the modified file, it won't work.

How to import tax rates properly. You need…

Amazon AWS Announced Fast Cross-Region EBS Snapshot Copy

Today, Amazon AWS announced yet another performance update, this time to EC2 Cross-Region EBS Snapshot Copy. Back on December 12, 2012, they announced EBS Snapshot Copy to allow cross-region transfer of existing snapshots for better disaster recovery. Starting today, they will only transfer the data that has changed since your last snapshot copy, thus transferring and storing less data and completing the copy faster. Even when copying for the first time, cross-region copy performance is excellent: a 10GB disk snapshot was transferred and became available 7 minutes after initiating the copy. EBS Snapshot Copy is simple to use. In the AWS Management Console, you can select the snapshot to be copied, set the destination region, and start the copy. The feature can also be accessed via the EC2 Command Line Interface or the EC2 API. Read more here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-copy-s

Amazon Announces RDS General Availability

Amazon Web Services Relational Database Service (Amazon RDS) is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. Amazon RDS gives you access to the capabilities of a familiar MySQL, Oracle or Microsoft SQL Server database engine. This means that the code, applications, and tools you already use today with your existing databases can be used with Amazon RDS. But RDS is not just any regular cloud MySQL service. It is used in large-scale, mission-critical applications such as Samsung SmartTV, Flipboard, Pinterest, Airbnb, NASA JPL and many, many more. We have been using Amazon RDS in a Multi-AZ deployment (db.m1.large) for more than a year for our CMS PaaS, and I can say only good things about it. The only big downtime we encountered was a year ago, when there was a major outage in the US-East region which also affected, for example, Netflix. Again, thanks to the cloud, we were able to relocate our databases to a different se…

Magento Go Platform Is Experiencing Major Technical Issues

This morning, the very popular e-commerce SaaS platform Magento Go experienced major technical problems: customers' storefronts were not opening, or were constantly timing out. Their support was unable to handle the overwhelming amount of messages for quite some time, and while their live chat is usually great, this time there was no help. They posted the following message on their Support page: June 2, 2013 7:44:12 AM PDT — "Some Magento Go stores are currently experiencing issues that are affecting their site. We've identified the root cause of the issue and are currently working to get all stores back up and running. Thank you for your patience. Please check back every 30 minutes for updates. Still have questions? Please contact Tech Support (support@magento.com)" Well, that was a few hours after the initial problem started. Now, after 6+ hours, there is no change. What's most frustrating is that there is no official announcement on their Twitter account - it would be great if they co…

Gracefully Exiting node.js / Express Web Application

Running a node.js application is easy, but how do you close the application gracefully, without errors? Typically you run the app on the command line while developing, e.g. node app.js, and press Ctrl+C to stop the running process. If you have sent your application into the background, you can execute the kill <pid> command; you can get the process ID <pid> with ps aux | grep node. This will kill your app, but if you want to end your application and close all its resources cleanly, you can use node.js process events:

// Execute commands on clean exit
process.on('exit', function () {
    console.log('Exiting ...');
    if (null != db) {
        db.close();
    }
    // close other resources here
    console.log('bye');
});

// happens when you press Ctrl+C
process.on('SIGINT', function () {
    console.log('\nGracefully shutting down from SIGINT (Ctrl-C)');
    process.exit();
});

// usually called with kill (the excerpt was truncated here; the handler
// mirrors the SIGINT one above)
process.on('SIGTERM', function () {
    process.exit();
});
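Assuming the handlers above, you can exercise the graceful path from the shell:

node app.js &          # run in the background; the shell prints the pid
kill <pid>             # sends SIGTERM, which triggers the handler and then the 'exit' event

One caveat worth knowing: 'exit' handlers must be synchronous, as node will not wait for asynchronous work started there, so anything async should be wound down from the signal handlers before process.exit() is called.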

Running your node.js express web apps in port 80

You use node.js and the express framework - you have made a great choice! By default, an express application will run on port 3000, and that is fine. In a production environment you may want to use nginx as a reverse proxy (and I will write my next post about it), but another way is to run your app on port 80. Assuming that your app.js already contains the following:

app.configure(function(){
    app.set('port', process.env.PORT || 3000);
    ...
});

In the shell, you can type the following command:

PORT=80 node app.js

or, if you use forever, then:

PORT=80 forever start app.js

On OSX you may get the following error:

TypeError: Cannot read property 'getsockname' of undefined

The cause is a general problem: you need root privileges to bind applications to ports lower than 1024. To avoid this, you can type sudo PORT=80 node app.js and your application will work normally on port 80. Another common mistake is that you may still have Apache runnin…
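A common alternative (my addition, not from the original post) avoids running node as root at all: keep the app on its unprivileged port and redirect port 80 to it at the firewall level on Linux:

sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3000

This way the application process never holds root privileges, which is generally safer than sudo PORT=80.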

Google Updated Hotel Price Ads Partner Front-End

There is a newer version of the Google Hotel Finder HPA partner front-end available. The new version allows partners to view their performance across various dimensions such as user location and Google site, to manage groups and bids, and to troubleshoot issues. What's new in the Hotel Price Ads PFE?

Group Management. The 'Groups' section provides performance information for all properties or for user-defined groups of properties. One of the advantages of creating groups is that you can track performance for the group: clicking on a group displays its performance information. More details on the performance section are found below.

Performance. The 'Overview' tab in the Performance section is a great way to get a high-level view of HPA performance. At a quick glance, this page displays trends over time and performance by geography and across Google sites. This section allows the user to view overall performance informati…

AWS OpsWorks now supports ELB!

Amazon Web Services announced a long-awaited update to OpsWorks, adding support for Elastic Load Balancing. During AWS Summit 2013 New York it was mentioned several times that ELB support would be added soon - I am glad they kept their word. Until now, developers taking advantage of AWS OpsWorks were forced to bring up one instance as an HAProxy; now you can add Elastic Load Balancing (ELB) to your OpsWorks application stacks and get all the built-in capabilities ELB is known for, including automatic scaling across availability zones. Combine this feature with Amazon Route53 latency-based routing, and you have a highly available, redundant, scalable application. Related article: AWS OpsWorks supports t1.micro instances

10gen Announces MongoDB Backup Service

If you are already using the MongoDB Monitoring Service (MMS), the next logical step is to sign up for their backup service. MongoDB Backup Service (MBS) is a cloud-based service provided by 10gen for backing up and restoring MongoDB. Engineered for MongoDB, it features point-in-time recovery and is hosted in reliable, redundant and secure data centers. Update - I got access to the MongoDB Backup Service and tried it out! Read more below.

MongoDB backup, my own way. Before MBS existed, I spent some time setting up backup scripts on one of the secondary servers in my replica set, and once set up it works very straightforwardly: create a backup archive, store it in S3, and keep only 30 days worth of backups:

run mongodump --oplog -u user -p pass mongobackups
create a backup-YYYYMMDDHH.tar.bz2 file from mongobackups
run /usr/bin/s3put to move the file to S3 storage
delete the mongobackups folder

Configure the S3 storage lifecycle:
Move to Glacier 1 day after creation date
Expiration (delete)…
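Pulling those steps together, here is a minimal sketch of such a script (the bucket name, the -o output flag for mongodump and the exact s3put invocation are my assumptions, not from the original post):

#!/bin/bash
STAMP=$(date +%Y%m%d%H)
mongodump --oplog -u user -p pass -o mongobackups           # dump the instance plus an oplog slice
tar cjf backup-$STAMP.tar.bz2 mongobackups                  # pack the dump into a dated archive
/usr/bin/s3put -b my-backup-bucket backup-$STAMP.tar.bz2    # upload to S3 (hypothetical bucket name)
rm -rf mongobackups                                         # clean up the dump folder

The 30-day retention then comes for free from the S3 lifecycle rules described above.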

AWS OpsWorks Supports t1.micro Instances

Amazon Web Services announced yesterday a long-awaited update to their OpsWorks product, adding a free tier and the t1.micro instance type - something whose absence had always kept me from trying OpsWorks, because the smallest instance was m1.small. You can now provision the following instance types for your web, database, HAProxy, etc. layers:

Micro: t1.micro
Standard 1st gen: m1.small, m1.medium, m1.large, m1.xlarge
Standard 2nd gen: m3.xlarge, m3.2xlarge
High-Memory: m2.xlarge, m2.2xlarge, m2.4xlarge
High-CPU: c1.medium, c1.xlarge
High-I/O: hi1.4xlarge
High-Storage: hs1.8xlarge

For operating systems, only Amazon Linux and Ubuntu 12.04 LTS are supported. As with OpsWorks in general, you can launch a web server stack behind HAProxy (Amazon reps say they will support Elastic Load Balancer very soon), and in addition to 24/7/365 servers you can prepare time-based servers and load-based spare servers, which start and stop on demand, or "follow-the-sun". AWS OpsWor…