Tuesday, May 1, 2012

Discover: Improved personalization algorithms and real-time indexing

We are beginning to roll out a new version of the Discover tab that is even more personalized for you. We’ve improved our personalization algorithms to incorporate several new signals including the accounts you follow and whom they follow. All of this social data is used to understand your interests and display stories that are relevant to you in real time.

Behind the scenes, the new Discover tab is powered by Earlybird, Twitter's real-time search technology. When a user tweets, that Tweet is indexed and becomes searchable in seconds. Every Tweet with a link also goes through some additional processing: we extract and expand any URLs available in Tweets, and then fetch the contents of those URLs via SpiderDuck, our real-time URL fetcher.
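
As a rough illustration of the extract-and-expand step, here is a minimal Python sketch. It is not SpiderDuck's implementation: the regular expression, the function names, and the `resolver` dictionary (a stand-in for following shortened links over HTTP) are all assumptions made for the example.

```python
import re

# Assumption: a deliberately simple pattern; production URL extraction is far stricter.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(tweet_text):
    """Return the URLs found in a Tweet, with trailing punctuation trimmed."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(tweet_text)]


def expand_urls(urls, resolver):
    """Map each URL to its expanded form, leaving unknown URLs unchanged.

    `resolver` is a hypothetical dict standing in for a real redirect lookup.
    """
    return [resolver.get(url, url) for url in urls]


tweet = "Reading http://t.co/abc, then https://example.com/x."
resolver = {"http://t.co/abc": "http://example.org/full-story"}
expanded = expand_urls(extract_urls(tweet), resolver)
```

A real fetcher would then download the contents of each expanded URL; that step is left out here because it depends on network access and crawl policy.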

To generate the stories that are based on your social graph and that we believe are most interesting to you, we first use Cassovary, our graph processing library, to identify your connections and rank them according to how strong and important those connections are to you.

Once we have that network, we use Twitter's flexible search engine to find URLs that have been shared by that circle of people. Those links are converted into stories that we’ll display, alongside other stories, in the Discover tab. Before displaying them, a final ranking pass re-ranks stories according to how many people have tweeted about them and how important those people are in relation to you. All of this happens in near-real time, which means breaking and relevant stories appear in the new Discover tab almost as soon as people start talking about them.
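
The ranking pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the connection weights would come from a graph library such as Cassovary, but here they are a hand-written dictionary, and the scoring rule (sum the weights of the distinct connections who shared a URL) is a hypothetical stand-in for the actual ranking function, which the post does not specify.

```python
from collections import defaultdict


def rank_stories(connection_weights, shares):
    """Rank URLs by the summed weight of the distinct connections who shared them.

    connection_weights: dict of user -> strength of that connection (assumed given).
    shares: iterable of (user, url) pairs; shares by non-connections are ignored.
    """
    sharers = defaultdict(set)
    for user, url in shares:
        if user in connection_weights:
            sharers[url].add(user)  # a set, so repeat shares count once per user

    scores = {
        url: sum(connection_weights[user] for user in users)
        for url, users in sharers.items()
    }
    # Highest score first; ties broken by URL for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


weights = {"alice": 0.5, "bob": 0.25, "carol": 0.125}
shares = [
    ("alice", "http://a.example/1"),
    ("bob", "http://a.example/1"),
    ("bob", "http://a.example/1"),   # duplicate share, counted once
    ("carol", "http://b.example/2"),
    ("dave", "http://b.example/2"),  # not a connection, ignored
]
ranked = rank_stories(weights, shares)
# ranked == [("http://a.example/1", 0.75), ("http://b.example/2", 0.125)]
```

Summing weights means a story shared by a few strong connections can outrank one shared by many weak ones, which matches the idea of weighing both how many people tweeted and how important they are to you.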

Our NYC engineering team, led by Daniel Loreto (@DanielLoreto), along with Julian Marinus (@fooljulian), Alec Thomas (@alecthomas), Dave Landau (@landau), and Ugo Di Girolamo (@ugodiggi), is working hard on Discover to create new ways to bring you instantly closer to the things you care about. This update is just the beginning of this ongoing effort.

- Ori Allon, Director of Engineering (@oriallon)

Thursday, April 19, 2012

Sponsoring the Apache Foundation

Open source is a pervasive part of our culture. Many projects at Twitter rely on open source technologies, and as we evolve as a company, our commitment to open source continues to increase. Today, we are becoming an official sponsor of the Apache Software Foundation (ASF), a non-profit and volunteer-run open source foundation.

Starting today, we are sponsoring The Apache Foundation. We look forward to contributing more and increasing our commitment to @TheASF

— Twitter Open Source (@TwitterOSS) April 19, 2012

The ASF provides organizational, legal, and financial support for a broad range of open source software projects that Twitter consumes and contributes to. One example is the Mesos project, which is now being developed inside the ASF Incubator and is nearing its first official release. Within Twitter, Mesos runs on hundreds of production machines and makes it easier to execute clustered jobs that do everything from running services to handling our analytics workload.

Sponsoring the ASF is not only the right thing to do; it will also help us sustain our existing projects at the ASF by supporting the foundation’s infrastructure. We have a long history of contributing to Apache projects, including not only Mesos, but also Cassandra, Hadoop, Mahout, Pig and more. As Twitter grows, we look to further our commitment to the success of the ASF and other open source organizations.

On behalf of the Twitter Open Source Office,
- Chris Aniszczyk (@cra)

Tuesday, April 17, 2012

Introducing the Innovator’s Patent Agreement

Cross-posted on the Twitter Blog.

One of the great things about Twitter is working with so many talented folks who dream up and build incredible products day in and day out. Like many companies, we apply for patents on a bunch of these inventions. However, we also think a lot about how those patents may be used in the future; we sometimes worry that they may be used to impede the innovation of others. For that reason, we are publishing a draft of the Innovator’s Patent Agreement, which we informally call the “IPA”.

The IPA is a new way to do patent assignment that keeps control in the hands of engineers and designers. It is a commitment from Twitter to our employees that patents can only be used for defensive purposes. We will not use the patents from employees’ inventions in offensive litigation without their permission. What’s more, this control flows with the patents, so if we sold them to others, they could only use them as the inventor intended.

This is a significant departure from the current state of affairs in the industry. Typically, engineers and designers sign an agreement with their company that irrevocably gives that company any patents filed related to the employee’s work. The company then has control over the patents and can use them however it wants, which may include selling them to others who can also use them however they want. With the IPA, employees can be assured that their patents will be used only as a shield rather than as a weapon.

We will implement the IPA later this year, and it will apply to all patents issued to our engineers, both past and present. We are still in early stages, and have just started to reach out to other companies to discuss the IPA and whether it might make sense for them too. In the meantime, we’ve posted the IPA on GitHub with the hope that you will take a look, share your feedback and discuss with your companies. And, of course, you can #jointheflock and have the IPA apply to you.

Today is the second day of our quarterly Hack Week, which means employees – engineers, designers, and folks all across the company – are working on projects and tools outside their regular day-to-day work. The goal of this week is to give rise to the most audacious and creative ideas. These ideas will have the greatest impact in a world that fosters innovation, rather than dampening it, and we hope the IPA will play an important part in making that vision a reality.

- Adam Messinger, VP of Engineering (@adam_messinger)

Monday, April 9, 2012

MySQL at Twitter

MySQL is the persistent storage technology behind most Twitter data: the interest graph, timelines, user data and the Tweets themselves. Due to our scale, we push MySQL a lot further than most companies. Of course, MySQL is open source software, so we have the ability to change it to suit our needs. Since we believe in sharing knowledge and that open source software facilitates innovation, we have decided to open source our MySQL work on GitHub under the BSD New license. The objectives of our work thus far have primarily been to improve the predictability of our services and make our lives easier. Some of the work we’ve done includes:
  • Add additional status variables, particularly from the internals of InnoDB. This allows us to monitor our systems more effectively and understand their behavior better when handling production workloads.
  • Optimize memory allocation on large NUMA systems: allocate InnoDB's buffer pool fully on startup, fail fast if memory is not available, and ensure consistent performance over time even when the server is under memory pressure.
  • Reduce unnecessary work through improved server-side statement timeout support. This allows the server to proactively cancel queries that run longer than a millisecond-granularity timeout.
  • Export and restore the InnoDB buffer pool using a safe and lightweight method. This enables us to build tools to support rolling restarts of our services with minimal pain.
  • Optimize MySQL for SSD-based machines, including page-flushing behavior and reduction in writes to disk to improve lifespan.
We look forward to sharing our work with upstream and other downstream MySQL vendors, with the goal of improving MySQL for the whole community. For a more complete look at our work, please see the change history and documentation. If you want to learn more about our usage of MySQL, we will be speaking about Gizzard, our sharding and replication framework on top of MySQL, at the Percona Live MySQL Conference and Expo on April 12th. Finally, contact us on GitHub or file an issue if you have questions.

On behalf of the Twitter DBA and DB development teams,

- Jeremy Cole (@jeremycole)
- Davi Arnaut (@darnaut)

Thursday, March 22, 2012

Security Open House March 29

The past few months have been busy for the Twitter security team: we’ve turned on HTTPS by default for everyone, added great engineers from Whisper Systems and Dasient, and had some stimulating internal discussions about how we can continue to better protect users. We want to share what we’ve been up to and discuss the world of online security, so we’ll be hosting a Security Open House on March 29 here at Twitter HQ. We’ve got a great lineup of speakers to get the conversations going:

Neil Daswani (@neildaswani): Online fraud and mobile application abuse

Jason Wiley (@capnwiley) & Dino Fekaris (@dino): Twitter phishing vect