A Logo for Informatics

2011
February 2nd
PUBLISHED TO
informatics, web, ideas


HELLO
MY NAME IS
Knowledge
The W3C recently introduced a logo for HTML 5 in an effort to provide a rallying point for next generation web technologies: something for people to stand behind.

It got me thinking: what would a logo for informatics stand for?

The logo was released with much fanfare by the W3C, and to some consternation in the web design community. It not only represented the HTML specification itself, but also served as an umbrella emblem for a collection of modern web techniques, with semantic markup, offline storage, push notifications and CSS 3 among them. Some cried foul that this was a dilution of the importance of each technique, whilst others were more philosophical.

As it happens, the W3C later clarified their position, but something lodged in my mind: if there were a logo for informatics, what techniques, skills and best practices would it embrace? What would informaticians want to stand behind?

Information & technology

Informatics is all about knowledge. It aims to introduce clarity where there is confusion, to drive understanding through computational methods. But it's more than the sum of its parts.

The cocktail of informatics is two parts programming, one part data modeling, one part statistical reasoning and three parts communication, and an informatician (or data scientist, as the cool kids are calling themselves these days) must employ a wide variety of skills to push forward scientific understanding.

It's also a discipline. A profession. A craft.

With that in mind, what core tenets and techniques would a master informatician define themselves by, and an informatics newcomer strive to realise? If we were to make little HTML-style validation badges, for what would they be awarded?

achievement unlocked
Σ
Numerical methods
The literature is packed full of robust models, methods and statistical tips and tricks that when properly applied can reveal information hidden within your data. From Bayesian data analysis to time series, from Monte Carlo simulations to non-parametric statistics. Mastering numerical methods and understanding their application can allow access to deep levels of knowledge within data.
Ψ
Software Best Practice
As Tom Peters would say, implementation is the final 90%, and in informatics that means writing code. Modern software development practices of version control, testing, environmental consistency, automation and documentation all increase the quality of code and the technique that code implements. Design patterns give a common vocabulary to software which reduces the activation energy for others to modify and enhance.
Availability
Improving the availability of a project's resources and related assets can significantly increase its impact, both within an institution and in the literature. Access to the related materials, methods, data and the workflows in which they were used is a great way to highlight the results they created. Likewise, committing to providing maintenance and updates has obvious benefits, but is often overlooked.
Reproducibility
Not only should informatics applications and their related assets be as available as possible, but they should be made available in a form that allows workflows to be easily executed. This can be challenging with lengthy pipelines or large data sets, but the tools are out there to help distribute and reproduce even complex, multi-tier research. Informaticians should embrace them.
Reuse and interoperability
Great research empowers others, and with the advent of cheap computers and storage, informatics has a gigantic opportunity to lower the activation energy for the next guy. Documentation, workflow tools and developer programming interfaces are central to promoting re-use of informatics approaches, and add greatly to the impact, depth and breadth of an algorithm, service or project.

These goals are complementary rather than mutually exclusive. For example, informatics solutions which follow software best practices are more likely to be maintained and remain available. These goals are a means to an end, and their application can result in high quality, reproducible, high impact research.
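
To make a couple of these badges a little more concrete, here is a minimal sketch, assuming Python and nothing beyond its standard library, of a non-parametric bootstrap estimate of a mean that uses a fixed random seed so that anyone can re-run it and get the same answer. The function name `bootstrap_mean_ci`, the measurements and the seed are all made up for illustration; this is one possible way to combine a numerical method with reproducibility, not a recommended pipeline.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for the mean (illustrative only)."""
    # A local, seeded generator: the same seed always gives the same interval.
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Resample the data with replacement, keeping the original sample size.
        resample = rng.choices(data, k=len(data))
        means.append(statistics.fmean(resample))
    means.sort()
    # Read the interval off the sorted resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    measurements = [4.1, 5.3, 4.8, 6.0, 5.1, 4.7, 5.5, 4.9]  # made-up values
    print(bootstrap_mean_ci(measurements))
```

Keeping the seed as an explicit, documented parameter (rather than relying on global random state) is a small design choice, but it is the sort of thing that lets the next person reproduce a result before they build on it.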

in conclusion

My aim here, dear reader, is to highlight the very best practice for modern informatics research. Standards we can all stand beside. We can always do more to raise the state of the art, but I would consider these little badges to be the key factors in accelerating research, and prerequisites for the sort of high quality, available, re-usable science we can be proud of.

a note on branding

This post isn't specifically about what a logo for informatics might look like, in terms of styling or design, but more about promoting best practices through a recognisable marque. That said, it's worth taking a moment to consider the profile of informatics in the wider research ecosystem.

Traditionally, informatics has suffered from something of an identity crisis. The webbed-fingered stepchild of computer science and the laboratory, the informatician has not been an easy fit into the wider world of research. Until recently.

As the role of data becomes more pronounced across research domains, and as the volume of that data continues to increase, the informatician is playing an increasingly central, recognised role. The odd melange of knowledge and skills that once made informatics difficult to place is now the exact mix of theory and application needed to push research forward.

It's time we started to dress accordingly.

INTO THE WONDERFUL by  MATT WOOD