At PalominoDB, we are not just another remote DBA; we are an integral part of your webops, DBA and BI teams.

Our experts provide...

  • Comprehensive system audits
  • Integration with Operations
  • Integration with Development
  • 24x7 Monitoring and support
  • Technical Consulting
  • Security Audits

Blog

Securing MySQL on Linux Systems

by MATT REID May 11, 2012

 

MySQL is one of the most popular relational databases, if not the most popular, chosen by internet-based startups over the last decade. Although it is a very robust platform and offers many of the features needed to support the database workloads of today's internet giants, it also has some security weaknesses that must be addressed before it is put into production use.

This article discusses the steps necessary to secure a basic MySQL installation and also covers more advanced database security topics as they apply to typical production environments.

Functionality

The following general statements apply to this security discussion:

  • This is a discussion of MySQL as installed on Linux; Windows, OS X, Solaris, and BSD installations may differ, and those differences are not covered here.
  • MySQL will be handling generic web traffic of the kind commonly generated by PHP, Python, Perl, and Java web applications.
  • The standard MySQL administration tools included with the server will be used as necessary for daily contact with the database server; non-standard tools are not discussed.
  • Any remote connection not made over SSH, either directly or via a tunnel, is assumed to be a security risk and is not advised, since it lacks encryption and allows passwords to be read in transit by potential attackers (an example tunnel is sketched after this list).
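
For connections that must cross an untrusted network, an SSH tunnel is a simple way to satisfy the last point. The following is a minimal sketch; the hostname, user, and local port are placeholders to be replaced with your own values.

$>  ssh -N -L 3307:127.0.0.1:3306 dbadmin@db.example.com

Then, in a second terminal, connect through the encrypted tunnel:

$>  mysql --host=127.0.0.1 --port=3307 --user=app_user -p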

 

Pre-Installation Security Recommendations

Before installing any software, it is a good idea to harden the operating system. This includes security measures such as:

  • Encrypted filesystems, to prevent inspection of data by unauthorized parties
  • Intrusion detection systems like Tripwire, which watch for changes to critical system files
  • Port lockdown via firewall software, to prevent access to system services (see the example after this list)
  • Strong root and OS-user passwords
  • Disallowing OS login by application users: set their shells to “nologin” or “/bin/false”
  • Setting up sudo for privileged accounts and requiring a password for sudo access
  • Running scheduled rootkit detection scripts
  • Running an operating-system-level access control framework such as the SELinux extensions or AppArmor. These restrict system and server processes from accessing data and resources that are not explicitly allowed. Because the access controls are often misunderstood, or administrators lack the desire to maintain them, these frameworks are very often disabled.
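
As a concrete illustration of two of the points above, the commands below disable interactive login for a hypothetical application account and restrict MySQL's port to a single application subnet with iptables. The account name, subnet, and port are placeholders; adapt them to your environment.

$>  sudo usermod -s /sbin/nologin app_user

$>  sudo iptables -A INPUT -p tcp --dport 3306 -s 10.0.1.0/24 -j ACCEPT
$>  sudo iptables -A INPUT -p tcp --dport 3306 -j DROP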

Post Install Security 

After MySQL is installed, either via RPM, Deb package, or other means, there are different approaches to securing the initial database. The first option is to execute the script provided with MySQL, named “mysql_secure_installation”. This will go through the following steps, which can also be taken manually if you prefer:

Checking for the existence of a root password, and setting one if none is found

mysql> SELECT * FROM mysql.user WHERE User='root';

mysql> UPDATE mysql.user SET Password=PASSWORD('your password here') WHERE User='root';

Deleting anonymous users

mysql> DELETE FROM mysql.user WHERE User='';

Removing remote access for the root account

mysql> DELETE FROM mysql.user WHERE User='root' AND Host NOT IN ('localhost', '127.0.0.1', '::1');

Removing the default “test” database

mysql> DROP DATABASE test;

Reloading the user privilege tables

mysql> FLUSH PRIVILEGES;

 

By default, MySQL runs via the “mysqld_safe” wrapper which, contrary to its name, does little to make MySQL more secure beyond ensuring that the mysqld process is not running as the root user. The script describes its own functionality as follows: “Script to start the MySQL daemon and restart it if it dies unexpectedly”. If one attempts to run mysqld as the root user, it will complain and refuse to start. If you are troubleshooting the mysqld process and want to run it without mysqld_safe, you can start it via sudo as follows. Your binary location may differ from /usr/local/bin/mysqld, so replace as necessary.

$>  sudo -u mysql /usr/local/bin/mysqld

 

MySQL Configuration Considerations

There are several configuration settings that can further secure the database during operation. These settings are found in /etc/my.cnf or /etc/mysql/my.cnf, depending on the Linux distribution being used.

 

old-passwords: this allows MySQL to create and authenticate users via the outdated and insecure password hashing used prior to version 4.1. Using it in production is strongly discouraged.

dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_old-passwords

sql-mode: this allows the administrator to run MySQL in various operating modes. There are many options, but for security the recommended minimum setting is “NO_AUTO_CREATE_USER” or “TRADITIONAL”.

dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_sql-mode

skip-grant-tables: this starts the mysqld process without any authentication tables, which is useful if one needs to recover or reset a lost root password. However, if you see this option enabled on a production server that is not undergoing recovery it should be seen as a critical security issue and dealt with immediately. 

dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_skip-grant-tables

bind-address: this can be used to restrict network and socket access to specific interfaces, thus ensuring that traffic can only originate through the desired interface. An example is a server with multiple network interfaces (eth0, eth1, eth2) that resides on multiple subnets; in some architectures the database will answer only to application servers or users on a specific subnet and thus the mysqld process needs to be restricted to listen for connections on only the required subnet. 

dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_bind-address
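
As a rough sketch, the recommendations above might look like this in the [mysqld] section of my.cnf. The bind address is a placeholder for whichever interface your application servers reach; old-passwords and skip-grant-tables are deliberately absent, since neither belongs in a production configuration.

[mysqld]
bind-address = 10.0.1.10
sql-mode = NO_AUTO_CREATE_USER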

 

Data Encryption

MySQL offers in-row data encryption functions. These can be used to ensure that, even in the event of a security breach, an attacker cannot read the data in the database tables without the encryption key. Although not functionality specific to MySQL, the SQL functions AES_ENCRYPT and AES_DECRYPT, along with a multitude of hash functions, are available for in-row data encryption. Read more about this topic here: dev.mysql.com/doc/refman/5.5/en/encryption-functions.html
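
The following is a minimal sketch of these functions in use; the table, column, and key are illustrative placeholders, and in practice the key should be supplied by the application rather than typed into a client where it can end up in logs or history files.

mysql> CREATE TABLE customer_secrets (id INT PRIMARY KEY, secret VARBINARY(256));

mysql> INSERT INTO customer_secrets VALUES (1, AES_ENCRYPT('4111-1111-1111-1111', 'example-key'));

mysql> SELECT AES_DECRYPT(secret, 'example-key') FROM customer_secrets WHERE id = 1;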

 

Database User and Connection Security Considerations

MySQL users, by definition, can connect to the database server and access data. How much or how little access your users are granted determines how much damage an account can do if it is compromised by an attacker. The most insecure user that can be created is one with a wildcard host (able to connect from any location, any host, any IP address) and the SUPER and GRANT options; such a user can create other users and can shut down the database server process. As such, it is wise to limit the amount of access a user is granted when creating accounts. Here are some things to keep in mind:

  • Limit user access to specific schemas: granting global access allows the user to reach all current and future schemas rather than a single defined schema (see the example after this list).
  • Limit user connections to specific originating IP addresses or subnets instead of allowing connections from any host or any network.
  • Where possible, limit users to connecting only from localhost (127.0.0.1) and advise that SSH tunnels be used to reach localhost.
  • Do not grant SUPER privileges to non-administrative users.
  • Do not grant replication privileges to users that are not replication processes.
  • When creating replication accounts, grant them only the replication privileges they need.
  • When possible, specify IP addresses rather than hostnames in host connection privileges. This removes the risk of DNS spoofing and removes the need for a DNS lookup during connection initiation, which saves time and resources.
  • Require strong passwords for all users and rotate passwords on a defined schedule.
  • If running connections over the internet without a VPN, SSH tunnel, or other encrypted channel, consider using SSL for all connections. Otherwise passwords can be seen in transit by attackers, much as with FTP.
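
The following sketch applies several of the points above; the account names, schema, subnet, and passwords are placeholders.

mysql> CREATE USER 'app_user'@'10.0.1.%' IDENTIFIED BY 'a-strong-password';

mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON app_db.* TO 'app_user'@'10.0.1.%';

mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'10.0.1.20' IDENTIFIED BY 'another-strong-password';

mysql> FLUSH PRIVILEGES;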

 

References for further reading:

  • dev.mysql.com/doc/refman/5.5/en/general-security-issues.html
  • dev.mysql.com/doc/refman/5.5/en/security-against-attack.html
  • www.sans.org/reading_room/whitepapers/application/

Redis Persistence

by TIM ELLIS Apr 28, 2012

Clients often ask us about the benefits of using key-value stores, such as Redis, for high-volume environments. One of the key benefits that is often cited is the durability offered by Redis persistence (as described in detail here).

Developer Salvatore ‘antirez’ Sanfilippo delves into the topic on his blog. Here are some key questions you should consider as you evaluate Redis in your own environment.

Redis replication (and one of its two methods of persistence) is achieved using the AOF (append-only file), which logs all statements that modify data. In the example in the article, a DELETE is issued on a non-existent key, and that statement is neither logged nor replicated to a slave Redis. But in real life, masters and slaves get out of sync, and it can be handy to have a statement that does nothing on the master but, when replicated to the slave, has the tangible effect of moving the master and slave closer toward convergence. It seems it should at least be an option to have "no-effect" statements replicate from the master to the slave.

When the AOF gets too large, an AOF rewrite occurs; the rewritten file is the minimal set of statements needed to reproduce the data set as it is in memory:

You may wonder what happens to data that is written to the server while the rewrite is in progress. This new data is simply also written to the old (current) AOF file, and at the same time queued into an in-memory buffer, so that when the new AOF is ready we can write this missing part inside it, and finally replace the old AOF file with the new one.

So what happens when we run out of RAM? That would be a very interesting behaviour to have defined unambiguously. He also doesn't talk about how long it takes to write the in-memory delta to the AOF before the swap-over. I suspect during that time, the server will be largely unresponsive (or should be, to preserve data integrity). On a busy server, there might be millions of entries in the in-memory delta buffer once the new AOF is finished writing.

If you think Redis having persistence means you can serve data from disk, you'd be wrong. The point of persistence is just to get the in-memory-only dataset back after a crash or restart.

AOF rewrites are generated only using sequential I/O operations, so the whole dump process is efficient even with rotational disks (no random I/O is performed). This is also true for RDB snapshots generation. The complete lack of Random I/O accesses is a rare feature among databases, and is possible mostly because Redis serves read operations from memory, so data on disk does not need to be organized for a random access pattern, but just for a sequential loading on restart.

So Redis really is just (a fast) memcached but with some persistence methods.

There is an option for fsync'ing data to the AOF in various ways. Be careful: the "appendfsync everysec" setting is actually worst-case every TWO seconds; it is only every second on average.

When you set appendfsync always, Redis still doesn't do an fsync after every write. If there are multiple threads writing, it'll batch the writes together doing what he calls "group commit." That is, every thread that performs a write in the current event loop will get written at the same time at the end of the event loop. I can't think of any downside to this, as I don't think clients get their response that data was written until the end of the event loop.
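
For reference, the relevant persistence settings live in redis.conf. A minimal sketch, with values that are illustrative rather than recommendations:

# log every write to the append-only file
appendonly yes
# fsync policy: always, everysec, or no
appendfsync everysec
# trigger an automatic rewrite once the AOF doubles in size
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb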

Restarting Redis requires either re-loading an RDB (Redis snapshot) or replaying AOF transactions to get to the state before the server was stopped. Redis is an in-memory database, so as you might expect, the start-up times are fairly onerous.

Redis server will load an RDB file at the rate of 10 ~ 20 seconds per gigabyte of memory used, so loading a dataset composed of tens of gigabytes can take even a few minutes.

That's the best case, as we note from the following:

Loading an AOF file [takes] twice per gigabyte in Redis 2.6, but of course if a lot of writes reached the AOF file after the latest compaction it can take longer (however Redis in the default configuration triggers a rewrite automatically if the AOF size reaches 200% of the initial size).

It isn't pretty, but I'm really glad the author is giving so much transparency here. An optimisation mentioned is to run a Redis slave and have it continue serving the application while the Redis master is restarted.

The author notes that while, in high-volume environments, a traditional RDBMS can in theory serve reads from the moment it's started, in practice that can leave the database fairly unresponsive as the disks seek wildly to pull in the required data. He further notes that Redis, once it starts serving reads, serves them at full speed.

At Palomino, we are dedicated to bringing rigor in benchmarking and analysis to the DBMSs that are our core competencies. In a future post, watch for us to test Redis and put some solid numbers behind some of this. We are interested in seeing how Redis performs during AOF rewrites and how long it takes to start up on typical modern hardware at typical modern loads.


The Postgres-XC 1.0 beta release overview

by EMANUEL CALVO Apr 26, 2012

 

When I heard about this project a year ago, I was really excited about it. Many cluster-wide projects based on Postgres have been developed very slowly, or are based on older (e.g. Postgres-R, www.postgres-r.org/) or proprietary (e.g. Greenplum) versions. The features this project hopes to achieve are ambitious, as we'll detail in this post. And best of all, the project is based on PostgreSQL 9.1, which is really up to date (at the time of writing, this is the latest stable version).

If you are interested in a serious project for scaling your PostgreSQL architecture horizontally, visit the official website at postgres-xc.sourceforge.net/ and take a look.

For those who are interested, there will be a tutorial at PgCon this year. As a brief overview, this post tries to give a broad idea of the project for those who want to get involved.

 

What Postgres-XC can do:

  • Support a multi-master architecture. Data nodes can contain part or all of the data of a relation.
  • Present a transparent view to the application from any master. The application only needs to interact with the cluster through the coordinators.
  • Distribute or replicate on a per-relation basis: replication, round robin (the default if no unique column is present), hash (the default if a unique column is present), modulo, or assignment to a specific group or node (see the sketch after this list).
  • Parallel transaction execution among cluster nodes.
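
As a sketch of the per-relation distribution mentioned above, the distribution strategy is declared when the table is created. The table and column names here are placeholders:

CREATE TABLE measurements (id int, sensor int, reading float)
    DISTRIBUTE BY HASH (id);

CREATE TABLE country_codes (code char(2), name text)
    DISTRIBUTE BY REPLICATION;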

 

What Postgres-XC cannot do:

  • Support triggers (may be supported in future releases).
  • Distribute a table with more than one parent. 

 

Before you start:

  You need to install recent versions of Flex and Bison. This is important because with the latest tarball, './configure' won't raise an error if they are missing; the error only appears once you execute 'make'. You will also need readline-dev and zlib-dev (not mandatory, but strongly recommended).

According to the documentation, Postgres-XC should be compatible with Linux platforms based upon Intel x86_64 CPU. 

The documentation needs improvement, so we advise you to try the steps directly and read the help output of each command. For example, the initdb command in the documentation is incomplete: "--nodename" is mandatory in this version. The project is new and has only a few contributors to date, but we hope its community keeps growing. Most importantly, it is great that a beta release was launched earlier than we expected.
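
For example, initializing a coordinator's data directory might look like the line below; the data directory and node name are placeholders, and the exact flags should be checked against 'initdb --help' for the release you are running.

$>  initdb -D /usr/local/pgsql/data_coord1 --nodename coord1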

 

Elements of architecture

 

  • GTM (Global Transaction Manager)

+ Note that I say GTM, not GTMs: only one GTM can be the manager. For redundancy you have the GTM-Standby, and to improve performance and failover, GTM-Proxies.

+ The GTM serializes all transaction processing and can limit overall scalability if it is deployed naively. It should not be used across slow or wide networks, and it is recommended to keep the number of switches between the GTM and the coordinators to a minimum. The proxies reduce the interaction with the GTM and improve performance.

+ Uses MVCC. That means it still uses the same concurrency control as Postgres.

+ This is the first thing you will need to configure. If you set up everything on the same machine, you will need to create a separate folder for its configuration and files.

 

  • Coordinator/s

+ Interface for applications (like a Postgres backend). 

+ It has its own pooler (yes, and I think this is great, as it avoids additional complexity in big environments).

+ Doesn’t store any data. The queries are executed in the datanodes, but...

+ … it has its own data folder (for global catalogs).

  • Datanode/s

+ Stores the data.

+ It receives each request with a GXID (Global Transaction ID) and a Global Snapshot, which allows it to accept requests from several coordinators.

Both datanodes and coordinators use their own data directory, so keep this in mind if you are setting both up on the same machine.

Configuring several GTM-Proxies improves scalability by reducing the I/O against the GTM. In addition, you can configure a GTM-Standby so the global manager is not a single point of failure. The GTM not only provides GXIDs; it also receives node registrations (you can trace your log or check the file called register.node inside the gtm folder; it is binary but readable) and, most importantly, it holds the snapshot of the current status of all transactions.

Coordinators can point to all the datanodes, and several coordinators can point to the same datanode (similar to Oracle RAC, though we are not sure whether all the features of that solution will be available for Postgres-XC). Coordinators connect to the proxies (if you have configured them) or to the main GTM.

Hope you enjoyed the read. We are preparing more cool material about this amazing solution, so keep in touch!


New versions of PgPool released - 3.1.3 & 3.0.7

by EMANUEL CALVO Apr 25, 2012


This essential tool for Postgres architectures is continually improving, and new releases are now available. Both are bugfix versions.
 
For those unfamiliar with the tool, it is middleware that acts as a load balancer, pooler*, and/or replication system for PostgreSQL databases. The 3.1.x versions are compatible with Postgres 9.x, whose streaming replication feature the Pgpool developers took advantage of. This allows the tool to balance queries without using the pgpool replication technique.
 
In the 3.1.3/3.0.7 fixes we have:

  • Allow multi-statement transactions in master/slave mode: since 3.1, transactions opened with BEGIN were sent to the slave/standby servers as well, which has undesirable effects when the transaction contains DELETE/INSERT/UPDATE statements, because standbys cannot execute writes. (3.1.3 fix)
  • Important fixes for failover detection and execution. (3.1.3 fix)
  • Added m4 files to avoid problems when compiling in older operating systems.
  • Fixed hangup on PREPARE errors.
  • Fixed memory leak in reset queries.


If you are running 3.1.x against Postgres 9 databases, we strongly recommend you upgrade PgPool because of the fix to multi-statement transaction handling.

For more information www.pgpool.net/mediawiki/index.php/Main_Page

* If you need only a connection pooler for Postgres, I prefer PgBouncer (wiki.postgresql.org/wiki/PgBouncer). It is lightweight, more focused, and simply works, with far less configuration; a minimal configuration is sketched below.
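
As an illustration, a minimal pgbouncer.ini might look like the following; the database name, paths, and pool sizes are placeholders rather than recommendations.

[databases]
app_db = host=127.0.0.1 port=5432 dbname=app_db

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20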


MySQL Conference and Expo - 2012 and our company offsite

by LAINE CAMPBELL Apr 19, 2012

Last week was a huge week for PalominoDB.  I will admit to being cautiously optimistic about Percona taking over this conference - one of the biggest events of the year for the MySQL community.  That being said, the conference was done quite well.  I really applaud Percona for making it happen.  In particular, the dot org pavilion was an excellent addition.  While I was stuck at the booth most of the event, I had a great view of the sheer variety of attendees coming through, and had the privilege of participating in a number of excellent conversations.

I noticed a number of patterns that seemed to prevail among the conversations.  Folks seem eager to move on to MySQL 5.5, finally comfortable with its stability.  Administrators are eager to learn more about MariaDB and Drizzle, and how they can differentiate themselves from the Oracle variants.  Partitioning is more prevalent as datasets grow, and sharding is becoming almost commonplace.  Now the focus is more on the challenging questions around HA - multi-master, synchronous replication and multi-datacenter installations.  People seem more interested in commercial additions around the MySQL ecosystem such as Tungsten, TokuDB, ScaleArc, Scalebase and Clustrix.  

René Cannao, one of our senior administrators, did a great tutorial on understanding performance through measurement, benchmarking and profiling.  Feedback has been great, and we look forward to continuing to evolve our benchmarking and profiling methods.  Please keep an eye out.

Additionally, we were able to announce a very exciting partnership with SkySQL.  PalominoDB focuses on operational excellence by providing top DBAs to our clients.  We dig into our clients' architectures and improve them, maintain them and help redesign them as they grow.  Our oncall services are top notch, regularly answering pages in under 5 minutes - and always providing a talented and experienced DBA on the other end of the phone.  What we don't do is software support.  It's just not our area of expertise.  We can tune it, run it and grow it, but for clients who need to dig into code, fix bugs and really provide deep internals knowledge in MySQL - SkySQL are who we turn to.   We are also quite excited to help augment SkySQL's excellent services with our own - to create some of the happiest customers out there.

We are thrilled to see our partnership ecosystem grow, our service offerings expand and knowledge of our brand and the quality of services continue to improve.  I can't help but glow with pride at the reputation PalominoDB has built - through our DBAs, our clients and our partners.  Being a part of the MySQL  conference and expo only cemented this pride in community, and pride in our work.  Thank you to everyone who has helped us get there.

Thank you all of you!
