Data, Big and Small
This week's notable link from LinkedIn is this delightful roundup of Five Trendy Open Source (Big-Data) Technologies. It goes through some of the newer hot products -- not stuff that's gotten mature like Hadoop, but newer concepts like Storm, Dremel, and Hana. Worth a read if you're doing any sort of big data at work, especially if you are in any way influencing architecture -- the enterprise world is driving advances in data processing at *remarkable* speed.

That said, it makes amusing reading for me right now. Everybody is talking about Big Data as the way to make money from enterprises. So I guess Querki might best be labeled the first truly serious Small Data project I've seen in a surprisingly long time. I'm explicitly not going after enterprise at all, at least not yet. (In a few years, if Querki is successful with consumers, we'll probably spin off a business-focused subsidiary. But first things first.) Indeed, for the time being I'm going to strictly limit the number of Things you can have in a Space, to somewhere in the tens-of-thousands range -- not even pocket change by Big Data standards.

Querki's underlying theory is that, while the Big Data problems are sexy to computer scientists and businesspeople, they have relatively little to do with the ordinary person on the information superhighway. Normal people are always trying to deal with *little* problems, involving only thousands, hundreds or even tens of things to keep track of. They don't care about lightning-speed processing of billions of records -- they care about being able to *easily* manage the small, everyday problems of the real world. And right now, they are looking sadly neglected.

I'm really quite enjoying this: there's nothing more exciting than finding a problem that nobody's dealing with well. Let's see if we can start a Small Data revolution, while the giants are all focused on the mountains in the distance...

So, one of the Small Data projects I have is SCA packing lists. No one packing list works for all events, because daytripping, hotel/crashing, cabin camping, tent camping, and Pennsic all have different lists of things that need to go. After continually forgetting certain items, I did at one point start a Microsoft Access database, but never kept up with it. For one thing, I started it when Greg and I started going to events together, and that caused my needs to change significantly. Our packing has been pretty stable for the last couple of years, so it's time for me to start the project again.

Yep -- that's a very simple but typical sort of Querki app, where you just want a lightweight way to keep track of a bunch of slightly structured data. In basic form, it ought to just plain work by the time I open the beta in April.

Now that I think about it, though, it's potentially an interesting use for the Mix-In system that's planned. While you are mainly working in a single Space (a bunch of related Things), each Space will be able to "inherit" from any number of others. So you could potentially have several sub-lists -- one for camping, one for feasts, one for teaching, etc -- and then mix them for particular kinds of events.
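To make the idea concrete, here is a minimal Python sketch of Spaces that "inherit" from others. It is only an illustration under assumptions: the `Space` class, its `mixins` field, the `all_things` method, and the sample items are all hypothetical names, not anything from Querki itself, and the real Mix-In system may well work differently.

```python
# Hypothetical sketch of Mix-In style inheritance between Spaces.
# None of these names come from Querki; they are invented for illustration.

class Space:
    """A named bunch of Things that can mix in any number of other Spaces."""

    def __init__(self, name, things=(), mixins=()):
        self.name = name
        self.things = set(things)    # Things defined directly in this Space
        self.mixins = list(mixins)   # Spaces this one "inherits" from

    def all_things(self):
        """Return this Space's own Things plus everything mixed in."""
        result = set(self.things)
        for parent in self.mixins:
            result |= parent.all_things()
        return result


# Sub-lists, one per kind of activity (sample items are made up).
camping = Space("Camping", {"Tent", "Ropes"})
feasts = Space("Feasts", {"Goblets", "Plates"})
teaching = Space("Teaching", {"Class notes"})

# A particular kind of event mixes the sub-lists it needs.
camping_feast_event = Space("Camping event with feast", mixins=[camping, feasts])

print(sorted(camping_feast_event.all_things()))
```

Because the combined list is a set, an item that appears in more than one sub-list shows up only once in the event.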

Fascinating. I'll have to add that to the Use Case list: it stretches the Mix-In system in some interesting ways, and will force me to think about how to make that easy to set up. Probably won't happen at the beginning, but hopefully I can make that work well...

Yes, sublists are important! My Quire folder was one thing I would forget regularly, because I needed it for some but not all local events. Once you get that up and running, I'd be happy to test it out.

By the way, if at some point you need to start looking at crunching numbers with this, I'd be interested in helping out.

Thanks! That's a fair ways down the line, but I might take you up on that eventually. (Especially once we start trying to figure out what Querki *should* have for statistical support. Data mining isn't the point of Querki, but some Apps are going to want statistics, I'd bet.)

And I've put this in as an official use case -- thanks...

I'm reminded of an article you probably forwarded to me that detailed how it wasn't the statistical package or the macros that were the killer app for Excel. It was the rows and columns.

Actually, I'm over-thinking this -- Mix-Ins just over-complicate it. Instead, you could do it with trivial Querki operations, like this.

Your Packing Space is mainly an inventory of your SCA stuff, with one Thing per item. These Things have the basic useful info: Name, Location, Number, possibly a photo if that's helpful.

Then you have several Things which are Sets of the items for particular categories of event. The Camping Set would include Tent, Ropes, Camp Stove. The Feast Set would include Napkins, Plates, Goblets. The Court Set would include Cushions, Handwork, and so on. (These are pointers to the actual Things, and overlap would be allowed, which is why we use Sets.)

Finally, for each event you create a trivial page that concatenates the Sets you want and turns them into a checklist.
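The three steps above can be sketched in a few lines of Python. This is not Querki syntax, just a rough model of the idea: the function names `event_items` and `render_checklist` are hypothetical, and putting Goblets in the Camping Set is an assumption added purely to show overlap between Sets.

```python
# Hypothetical sketch: Sets of item names combined into an event checklist.
# Function names are invented; this is not Querki's actual API.

# Sets point at items by name. Overlap between Sets is allowed.
camping_set = {"Tent", "Ropes", "Camp Stove", "Goblets"}  # Goblets added here only to show overlap
feast_set = {"Napkins", "Plates", "Goblets"}
court_set = {"Cushions", "Handwork"}


def event_items(*item_sets):
    """Union the chosen Sets, so each item appears once, and sort by name."""
    combined = set().union(*item_sets)
    return sorted(combined)


def render_checklist(names):
    """Turn item names into plain-text checklist lines."""
    return "\n".join(f"[ ] {name}" for name in names)


# An event page for a camping event with a feast.
print(render_checklist(event_items(camping_set, feast_set)))
```

Using sets rather than lists is what keeps Goblets from being listed twice when both the Camping and Feast Sets include them.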

Still kind of interesting technically: it's finally a motivation for me to implement Sets (which are more correct than Lists in this case), and it'll be interesting to figure out how to make it trivial to set up an event page.

But overall -- yeah, this is the sort of thing you do with Querki...

I was thinking much the same. Sometimes we take the dogs to an event, which entails a particular set of Stuff. Sometimes it's a camping event, which entails a particular (large) set of Stuff. A few things are needed only when we're camping-with-the-dogs (not when we're just camping, and not when we're day-tripping with the dogs). And so on.

So far I've been doing this with the "Packing" app on my iPhone. There's a "list" type associated with a particular trip or kind of trip; you can create a new list by starting with an existing one as a template, then adding and deleting things. A "category" (e.g. "clothing") is broken down into "kinds" (e.g. "pants") and then into "items" (e.g. "shorts", "jeans", "dress slacks"). You can easily add a whole category to a list, then remove the items that you don't actually need for this trip. If you feel like putting the same "item" into multiple "kinds" or "categories", you can, but I think the program treats them as completely different items that coincidentally have the same name. I have the free version of the app; I gather the paid version supports cross-machine syncing.

Not surprising -- I expect many Querki Apps to already exist as one-offs. The main benefit of Querki is to make it ridiculously easy to roll something like that out.

Still, nice to know that somebody's already done a similar analysis, and that it sounds like their functionality should all be doable in Querki at least eventually. (The category/kinds/items hierarchy should be trivial to implement using hierarchical tags, which is going to be a fairly early feature.)

I'm not sure you WANT to duplicate the Packing app's hierarchy: I find it overly rigid myself. Although I just discovered a previously-unexplored corner of the UI that doesn't insist that items have a kind, only a category -- which frequently works better for me.

A somewhat related use-case, which I was planning to type in a few weeks ago but I think I lost my net connection in the middle: my living-history group, La Belle Compagnie (http://labelle.org) wants to inventory its stuff. Each item has the usual descriptive fields, including (in many cases) a photo. Some items are approved for use in a third-person show but not in a first-person show; some are approved for a 1380's scenario and some for a 1410's scenario; some are approved for gentry use and some for commoner use; some are approved for gentry in the 1380's and all ranks in the 1410's; and so on. Most items belong (in real life) to one particular person, who may or may not play the character who "owns" the item in a show. Some items belong to one person but (for reasons of storage or transport) are usually kept at another person's house. And since the members of La Belle Compagnie are scattered from New York City to southwestern Virginia (plus one in Texas), it would be nice if all the members had (appropriate levels of) access over the Net.

Edited at 2012-11-02 05:17 pm (UTC)

"I find it overly rigid myself."

Common problem. That's why Querki is going to emphasize TagSets instead. In general, the correct answer to these sorts of categorization problems is going to be a Set[Tag]. Each Tag is arbitrary and hierarchical, and users can add new nodes as they like. Using them as a Set isn't going to be mandated, but is recommended, since most problems turn out to be multi-dimensional once you understand them well enough.
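As a rough illustration of why a Set[Tag] is less rigid than a fixed category/kind/item tree, here is a small Python sketch. It assumes tags are written as slash-separated paths, which is purely a convention invented for this example; the helper names `matches` and `has_tag` and the sample tags are hypothetical as well.

```python
# Hypothetical sketch of hierarchical tags used as a set.
# The slash-separated path format is an assumption made for this example.

def matches(tag, query):
    """True if tag is the query node itself or sits anywhere beneath it."""
    return tag == query or tag.startswith(query + "/")


def has_tag(tags, query):
    """True if any tag in the set falls under the queried node."""
    return any(matches(tag, query) for tag in tags)


# One item can carry tags along several independent dimensions at once.
shorts = {"clothing/pants/shorts", "trip/camping"}
dog_bowl = {"pets/dogs", "trip/camping", "trip/day-trip"}

print(has_tag(shorts, "clothing"))     # matched via a descendant node
print(has_tag(dog_bowl, "clothing"))   # no clothing tag at all
print(has_tag(dog_bowl, "trip"))       # matched on a different dimension
```

Adding a new node is just a matter of using a new path, and an item never has to be forced into exactly one branch of the tree.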

As for the Inventory Use Case, it's already in the list in the general case. You're going to want to add a few properties for your example (Period, Users, Owner), but part of the point of Querki is that adding new Properties will be near-trivial. Remind me when the beta opens (April, with any luck), and I'll be happy to show you how to set it up...

"And since the members of La Belle Compagnie are scattered from New York City to southwestern Virginia (plus one in Texas), it would be nice if all the members had (appropriate levels of) access over the Net."

Part of the point of the exercise: access control will be built into Querki from the start. I suspect it'll take a little while to get it just right, but one of the key motivations here is that *most* apps have a social component to them, and you shouldn't have to engage in a lot of extra work to enable that...
