More fun with OP synonyms
I'm getting to the point of diminishing returns, so it's getting to be time for me to give up on trying to polish the data; please forgive the duplications that make their way into the final online Order of Precedence, which will have to be merged by hand after it goes live. I've eliminated many thousands of duplicate records, but I'd be surprised if there are fewer than a few thousand that make it in. (There are still about 9000 incomplete records -- more than the 6000 I was targeting, but I think we'll have to live with it.)

But the system continues to be disconcertingly smart. Today, it complained to me that we had duplicate alpha entries for "Elizabeth Vynehorn" and "Muirne ni Cormaic", which led me on a merry chase: I couldn't figure out *why* it had decided that they were synonyms (I have begun to regret not building a system that records the reasoning, which gets pretty subtle and obscure from run to run), but I fortunately found her LJ -- I hadn't realized that Muirne had changed her name. So I've updated my copy of the old OP accordingly.
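
For the curious, here is roughly what "recording the reasoning" could look like. This is a minimal sketch in Python, not the actual OP code: the `SynonymGuess` class, the `guess_synonym` function, the 0.8 similarity threshold, and the two signals it checks (similar spelling, and the same award received on the same day) are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class SynonymGuess:
    """A candidate pair of names, plus every reason the pair was flagged."""
    name_a: str
    name_b: str
    reasons: list = field(default_factory=list)


def guess_synonym(name_a, awards_a, name_b, awards_b, threshold=0.8):
    """Return a SynonymGuess with its reasons, or None if nothing matched.

    awards_a and awards_b are sets of (award, date) tuples. Both signals
    below are hypothetical stand-ins for whatever the real system checks.
    """
    guess = SynonymGuess(name_a, name_b)

    # Signal 1: the two spellings are close to each other.
    ratio = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
    if ratio >= threshold:
        guess.reasons.append(f"names are {ratio:.0%} similar")

    # Signal 2: both records list the same award on the same day.
    for award, date in sorted(awards_a & awards_b):
        guess.reasons.append(f"both received {award} on {date}")

    return guess if guess.reasons else None
```

With something like this, each question the system asks could be printed alongside `guess.reasons`, so a weak guess that rests on a single shared award would be obvious at a glance.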

Oh, in case anybody is interested -- one artifact of this project will be my final master copy of the old HTML files. These are massively cleaned-up HTML, and have many errors and duplications of this sort fixed. Folks are welcome and encouraged to refer back to these files after the new system goes live, since they are the data that the new OP will be bootstrapped from. The "alpha", "awards" and "chrono" directories roughly correspond to the files on, but with a great deal of massaging.

And the record for longest "alternate names" field goes (no surprise) to Mistress Nataliia Anastasiia Evgenova Sviatoslavina vnuchka, whose name is so long, and *never* spelled quite correctly in the Court Reports, that she winds up with 515 characters of alternate name field so far. (Far more than the 255 allowed -- I had to introduce some trimming code to keep her entry from breaking the database. I think she'll survive without every single misspelling recorded for posterity in her record.)
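
The trimming itself is nothing fancy. A sketch of the idea, assuming the alternate names are held as a list and joined with "; " (the separator, the function name and the case-insensitive de-duplication are illustrative assumptions, not the real code; only the 255-character limit comes from the database):

```python
MAX_ALT_NAMES_LEN = 255  # the field limit mentioned above


def trim_alternate_names(names, max_len=MAX_ALT_NAMES_LEN, sep="; "):
    """Join alternate names into one field, stopping before it overflows.

    Names are kept in their original order; case-insensitive repeats are
    skipped, and the first name that would push the field past max_len
    ends the list (whole names only, never a name cut in half).
    """
    kept = []
    seen = set()
    length = 0
    for name in names:
        key = name.casefold()
        if key in seen:
            continue  # already recorded this spelling
        extra = len(name) + (len(sep) if kept else 0)
        if length + extra > max_len:
            break  # no room left for this name
        seen.add(key)
        kept.append(name)
        length += extra
    return sep.join(kept)
```

Cutting at a name boundary rather than at exactly 255 characters means the field never ends in a half-recorded misspelling.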

Anyway, I'm continuing to plow through and finish the current round of synonyms. When it is asking me whether Nathaniel Wyatt and Karrah the Mischevious are the same person, we're definitely running out of good guesses. (Yes, there was a reason -- they apparently were inducted into the White Oak the same day. Still, not exactly a high-quality guess...)

::blink:: The idea of a world where they ARE the same person is an interesting one though...

Yeah, I'll admit that that was probably the funniest of the guesses that has been offered to date...

(Deleted comment)
Have you received anything under the new name? I don't see any sign of it in the OP. I am specifically *not* making routine changes at this stage of the game -- it will be much, much, much easier and more reliable to do that in the new system once it is up and running, and requests for changes should go to Shepherd's Crook once that's the case.

What I'm doing is just dealing with the massive inconsistencies in the existing data -- in your case, figuring out that "Andrea Caitlin MacIntyre", "Andrea Caitin McIntyre" and "Andrea MacIntyre" are all the same person. (Yes, you are currently in the OP all three ways. That's fairly typical, and without the code I've been writing, those would wind up in the new OP as three separate people...)

(Deleted comment)
Okay, cool. In about two months, drop Shepherd's Crook a note about the name change. The new system has a built-in "send a request" feature, so it should be pretty easy...

I'm sorry about that! I've been meaning to inform the proper authorities for ages, but it kept slipping my mind. I swear, if it's not actually written down on a to-do list, it doesn't get done.

On the bright side, it sounds like the system is working wonderfully. Thanks for creating such a wonderful system!
