Monday, February 22, 2016

Big Data Basics

I had the pleasure of attending Old Dominion University's High Performance Computing Day last week, a symposium focused on the possibilities of research computing. At HPC Day, the keynote "Data in Modern Times" was given by Zachary Brown, and it was an excellent reminder of some Big Data foundations for all of us.

Summarized from my notes:

"Big Data" is a new construction. In the past [really tempted to say "ye olden times" here] SQL-driven relational databases were enough--users could derive whatever insight they needed from the database with a structured query, and call it a day.

These days, a series of factors, often known as the "Many V's," help define, or operationalize, what people mean by "Big Data." Zachary focused on three:

Velocity

New data is coming in fast. Often too fast to store without some processing to determine what's worth keeping.

Volume

There's so much data that it's unwieldy to handle with standard or legacy methods.

Variety

The data isn't neatly organized. It may be very text-based, it may include lots of different attributes, or may otherwise not fit into nice boxes for analysis.
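To make the Velocity point a bit more concrete, here is a minimal sketch (mine, not from the keynote) of "some processing to determine what's worth keeping." The sensor readings, the `keep_worth_storing` helper, and the threshold of 5.0 are all made up for illustration; a real pipeline would use whatever rule fits the data.

```python
from typing import Iterable, Iterator


def keep_worth_storing(readings: Iterable[dict], threshold: float) -> Iterator[dict]:
    """Yield only the readings whose value meets the (hypothetical) threshold."""
    for reading in readings:
        # Decide as each record arrives, so nothing has to be stored first.
        if reading["value"] >= threshold:
            yield reading


# Hypothetical incoming records; in practice this could be an endless stream.
incoming = [
    {"sensor": "a", "value": 0.2},
    {"sensor": "b", "value": 7.5},
    {"sensor": "a", "value": 0.1},
    {"sensor": "c", "value": 9.1},
]

stored = list(keep_worth_storing(incoming, threshold=5.0))
print(stored)
```

Because the helper is a generator, it looks at one record at a time and never needs the whole stream in memory, which is the point when data arrives faster than it can be stored.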

As a side note, as data science evolves, other organizations and pundits are suggesting other V's. It's important to note that the original 3V's were measures of magnitude, as discussed here. "Strategy V's" attempt to address other facets of the problem: Veracity (can the data be trusted, and to what level?); Value (why are we bothering to work with the big data anyway?); Variability (do we understand the context of the data?); and (my favorite) Visualization (how can we present the data in a way that carries meaning?). More on all 7 V's here.

The keynote continued by addressing ways to "counter" each of the three V's:


To address the needs of Big Data in… | We must be…
Velocity | Agile
Volume | Scalable
Variety | Flexible

Agility, with regard to Big Data, focuses far more on finding the right tools for the right job than on "Agile" software development or other "Agile" methods. Embarking on big data analysis can be overwhelming, so preparing to "drink from the firehose" requires dexterity.

Scalability addresses volumetric concerns by planning in advance to create an architecture that can be extended as needed if data continues to grow. Many of the data tools available today work across computer clusters--multiple machines working together to run analyses. That type of architecture works well with a few machines, and just as well with dozens, depending on the needs of the researcher.
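A rough sketch of that idea, under some loud assumptions: the threads below are only a stand-in for machines in a cluster, and the word-count task and the function names (`count_words`, `split_into_chunks`, `word_count`) are hypothetical examples, not anything shown in the keynote. What it illustrates is that the same code runs unchanged whether the work is split 2 ways or 12.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: list[str]) -> Counter:
    """Count the words in one chunk of lines (the work one 'machine' does)."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def split_into_chunks(lines: list[str], n_workers: int) -> list[list[str]]:
    """Deal the lines out round-robin, one chunk per worker."""
    return [lines[i::n_workers] for i in range(n_workers)]


def word_count(lines: list[str], n_workers: int) -> Counter:
    """Count words across all lines using n_workers parallel workers."""
    chunks = split_into_chunks(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    # Merge the partial counts from every worker into one result.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


lines = ["big data is big", "data is fast", "fast data is messy"]

# Same code, different number of workers, same answer.
print(word_count(lines, n_workers=2) == word_count(lines, n_workers=12))
```

Real cluster tools take care of distributing the chunks and collecting the results across machines; the split, work, and merge shape is the part that carries over.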

Flexibility tackles variety by drawing on a range of methods for working with Big Data. Unstructured, text-based data? Use machine learning. Messy data? Tools like OpenRefine can help scrub your data. Flexible methods allow data scientists to adjust their approaches to suit the problems at hand, rather than cleave to a one-size-fits-really-very-few method.

All in all, the keynote at ODU's High Performance Computing Day served as a great reminder of the various concerns surrounding Big Data, some key aspects of defining "Big Data" as a concept, and some strategic thinking to address the challenge of Big Data. Next steps include putting these reminders into practice as we all handle Big (and smaller) Data in work and life!

Friday, February 5, 2016

Management, Fish, and the Art of Being Present at Work

This week I attended the kickoff session of a six-week professional development opportunity, the Leadership Management Development Certificate (track one), here at Old Dominion. The morning's discussions and activities have prompted a lot of thought, and reflecting here will force me to organize my thinking in ways that might be helpful.

First, I really appreciate a few things the facilitators did to make us all feel welcome--introductions, icebreakers, and a discussion of "agreements" that would serve as ground rules and guidelines for all the sessions in the program. The facilitators also pointed out the "parking lot," a large flipchart designed to capture topics that didn't quite fit into the current discussion. By designating a place to capture those tangents, interesting-but-not-relevant discussions can be "parked" and picked up later when time allows. I'd like to try that model the next time I facilitate a similar workshop--by giving the discussion a place to rest, participants can focus their attention on other topics without worrying about forgetting the "parked" discussion.

The bulk of the morning was focused on the Fish! philosophy of workplace management. Though Fish! has been around for a while, it's always nice to have a reminder of the four main ideas:

  • Choose Your Attitude
  • Play
  • Make Their Day
  • Be There
(The fish is secondary.)

"Play" can take many forms at work, but the important takeaway is the notion that you can--and should--find a way to play and have fun at work. This leads into "Make Their Day," which says that a little extra attention can make a permanent difference in somebody's life. Attitudes can be infectious, and we all can "Choose [Our] Attitudes" to make sure that we have a positive outlook, and can then share that with our coworkers and the communities we serve. Finally, we all have the option to "Be There"--being present in the day, in the moment, and in the actions we take.

The general tone of the philosophy boils down to the notion that if you love your job, it's going to show, and affect others in positive ways. We're more productive when we're happy, and being happy can help us avoid stress. People are always watching each other, and just one positive agent in the room can radically change the tone of the entire discussion.

Particularly resonant for me was the final point, "Be There." In the last few years, I've tried more and more to live mindfully, staying in the present moment as much as possible. Sometimes, it's a lot trickier than it sounds. There's always more work to be done, projects piling up, and a work-life balance to strive for...but when the lists get overwhelming, I'm learning to pause, take a deep breath or three, and dive back in, chipping away at the pile bit by bit until I'm happy with the work I've accomplished.

I'm sure there will be more lessons, more reflection, and more to discuss at the next five sessions of LMDC. For now, though...it's time to be present.

Wednesday, February 3, 2016

That'll Teach Me...

I've often been one to overestimate my energy level and capability to DO ALL THE THINGS! There are so many great opportunities, cool ideas, and intriguing projects in the world that it can be really tricky for me to prioritize everything.

This blog was a wonderful place for me to ruminate while I was in graduate school, and still hosts the bulk of my casual writing on librarianship and information topics. That said, I'm slowly starting to write and publish professionally, and the bulk of my time is spent "on the ground" practicing librarianship, not just thinking about it.

So, no more promises to start blogging regularly again. You'll still see occasional pieces posted here, and I have no intention of "retiring" this site, but frequent updates are likely a thing of the past, at least for a while. Thanks for reading, and do keep an eye out for changes as this site transitions to more of a static home for my work on the web.

Cheers!