Data Mining Old Ship's Logs

Sun Aug 03 20:05:00 -0700 2008

Researchers today have a host of advanced techniques for gathering environmental data, from satellite imagery to ice core samples, but they don't have much in the way of practical, everyday weather readings that go back very far, except for the rich history preserved in old ships' logs. Tapping into this treasure of data will help to sort out just what is going on with fast cooling and fast warming trends, along with the associated storms, droughts, and excessive rainfall that still bedevil modern climate modelers.

"Ships' officers recorded air pressure, wind strength, air and sea temperature, and other weather conditions," Dr Willis said. "From those records scientists can build a detailed picture of past weather and climate." ed.z.: Certainly a good idea! It will be interesting to see what outlooks and beliefs might change once we have a much longer time frame to look at in a bit more detail.

Data Mining Old Ship's Logs
Sun Aug 03 21:12:32 -0700 2008

Climate scientists should get a lot of useful data from this source. Another place they should look is the logs and records of lighthouse keepers who, like the sailors, kept a "keen eye" on the weather.

Data Mining Old Ship's Logs
Mon Aug 04 04:12:29 -0700 2008

I suspect that it would still be rather subjective data. After all, the records cannot possibly be better than the instruments they had to record them with.

And "It really, really, rained a lot today." is possibly a bit helpful, but it will not actually do much to make a statistical model have much more accuracy.

Still, it will shed some light --I just hate to think of the possible spin from using data with less accuracy than the modern data they may mix it with.

Data Mining Old Ship's Logs
Mon Aug 04 07:54:56 -0700 2008

"I just hate to think of the possible spin from using data with less accuracy than the modern data they may mix it with"

If you think that this is going to be a major factor increasing spin doctoring, you must be a real optimist.

Data Mining Old Ship's Logs
Mon Aug 04 08:58:57 -0700 2008

Well, you've got me there.

When I finally managed to stop laughing, I was forced to admit to myself that there was no way I could spin it better than you just did.

Data Mining Old Ship's Logs
Mon Aug 04 09:44:56 -0700 2008

"I just hate to think of the possible spin from using data with less accuracy than the modern data they may mix it with."

hahaha, that's one of the many huge fallacies of the climate modeling religion, and they've been committing that sin for some time now.

A good $300 (over $1,000 now) laboratory-grade mercury thermometer from the late 1970s, complete with a correction factor chart readable to hundredths of a degree, would still be off by plus or minus a few tenths of a degree over the range of the device.

And the climate modeling dweebs take numbers from a handmade, lip-blown thermometer from a hundred or more years ago as gospel, plug them into their supercomputers, and then expect us to make and act on policy based on whatever spews out.

And now they're talking about going even further back in time, even to when there is no numerical data, and inferring from the adjectives used what the conditions were

and then modeling the next century's climate.

ho boy, there's an entire mindset of self-delusion and a religion of pseudo-science into which our civilization is falling.

Data Mining Old Ship's Logs
Mon Aug 04 09:49:55 -0700 2008

"into which our civilization is falling."

into which our civilization has fallen.

There, corrected that for you.

Data Mining Old Ship's Logs
Mon Aug 04 11:02:09 -0700 2008

Help!  My civilization has fallen and can't get up!

Data Mining Old Ship's Logs
Mon Aug 04 10:33:04 -0700 2008

"After all, the records cannot possibly be better than the instruments they had to record them with."

Have you so soon forgotten the differences between "accuracy" and "precision"? If they recorded numerous instrument readings then they'll have a picture of what was going on even if the numbers aren't very accurate.
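The accuracy-versus-precision point can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the instrument's 2-degree offset, its 0.5-degree scatter, the two "true" temperatures, and the function name `simulate_readings` are all hypothetical and are not taken from any ship's log.

```python
import random
import statistics

def simulate_readings(true_temps, bias, noise_sd, rng):
    """Readings from a hypothetical instrument with a fixed offset (bias)
    and random scatter (noise_sd) around the true temperature."""
    return [t + bias + rng.gauss(0.0, noise_sd) for t in true_temps]

rng = random.Random(42)       # fixed seed so the sketch is repeatable

cool_period = [14.0] * 500    # assumed true temperature, period 1
warm_period = [15.0] * 500    # assumed true temperature, period 2
bias = 2.0                    # instrument reads 2 degrees too high (inaccurate)
noise_sd = 0.5                # but scatters only a little (fairly precise)

cool_readings = simulate_readings(cool_period, bias, noise_sd, rng)
warm_readings = simulate_readings(warm_period, bias, noise_sd, rng)

mean_cool = statistics.mean(cool_readings)
mean_warm = statistics.mean(warm_readings)

# The absolute level is wrong by roughly the bias...
absolute_error = mean_cool - 14.0
# ...but the change between periods is recovered, because the bias cancels.
estimated_change = mean_warm - mean_cool

print(f"absolute error:   {absolute_error:.2f}")
print(f"estimated change: {estimated_change:.2f}")
```

The cancellation only works if the offset stays the same across both periods; an instrument whose bias drifts, or a mix of differently biased instruments, would not behave this way.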

Data Mining Old Ship's Logs
Mon Aug 04 11:15:02 -0700 2008

But you are assuming each instrument at the time had the same degree of precision and accuracy, or even of either one, as every other instrument at the time.

And they are taking data from a large period of time --ships' logs have been kept about forever --there are no doubt some examples that date back to just after the burning of the library at Alexandria --how much do you want today's models to rely on that data?

I find this whole idea very doubtful --well, too doubtful to be of much real help in refining our current models.

Data Mining Old Ship's Logs
Mon Aug 04 13:39:54 -0700 2008

No, I'm assuming that they had a semi-reliable way of measuring these stats, since they would be able to tell that the thermometer was broken if it read 0 degrees while they were sweating. Even knowing the temperatures were in the range of 70 degrees +/- 5 degrees during certain months of the year at certain geographic locations tells more than you are letting on. Hell, we didn't get away from using a water-soaked piece of cotton wrapped around a thermometer to determine humidity until semi-recently. And it is possible that they read several instruments and took an average as well, since some could serve as backups if one was broken. Just because the instruments were old does not mean the science is broken.

Data Mining Old Ship's Logs
Mon Aug 04 17:38:50 -0700 2008

I'm going to just let Cory tell you what is wrong with adding all that semi-accurate data to the weather model:

(This applies to any data driven test, including data driven climate model testing for future predictions of climate change)

"Statisticians speak of something called the Paradox of the False Positive. Here's how that works: imagine that you've got a disease that strikes one in a million people, and a test for the disease that's 99% accurate. You administer the test to a million people, and it will be positive for around 10,000 of them - because for every hundred people, it will be wrong once (that's what 99% accurate means). Yet, statistically, we know that there's only one infected person in the entire sample. That means that your "99% accurate" test is wrong 9,999 times out of 10,000!

Terrorism is a lot less common than one in a million and automated - for terrorism - data-mined conclusions drawn from transactions, Oyster cards, bank transfers, travel schedules, etc - are a lot less accurate than 99%. That means practically every person who is branded a terrorist by our data-mining efforts is innocent.

In other words, in the effort to find the terrorist needles in our haystacks, we're just making much bigger haystacks."
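The arithmetic in the quoted disease example can be checked with a few lines. This is a minimal sketch that assumes "99% accurate" means the test is right 99% of the time for both infected and healthy people; the function name `false_positive_summary` is made up for illustration.

```python
def false_positive_summary(population, prevalence, accuracy):
    """Expected counts for a test applied to a whole population.

    Assumes the same accuracy for infected and healthy people
    (an assumption; the quote does not distinguish the two).
    """
    infected = population * prevalence
    healthy = population - infected
    true_positives = infected * accuracy
    false_positives = healthy * (1.0 - accuracy)
    total_positives = true_positives + false_positives
    return {
        "infected": infected,
        "false_positives": false_positives,
        "total_positives": total_positives,
        # Chance that a person who tests positive is actually infected.
        "ppv": true_positives / total_positives,
    }

# Numbers from the quote: one in a million, 99% accurate, a million people.
summary = false_positive_summary(1_000_000, 1e-6, 0.99)

print(f"positives:       {summary['total_positives']:.0f}")
print(f"false positives: {summary['false_positives']:.0f}")
print(f"P(infected | positive): {summary['ppv']:.6f}")
```

Under these assumptions roughly 10,000 people test positive while only about one is infected, which matches the "wrong 9,999 times out of 10,000" figure in the quote. Note that this is a statement about a yes/no test for a rare condition; whether it carries over to noisy numerical readings is exactly what the rest of the thread argues about.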

That's just a cut and paste (because I don't want to waste a lot of time trying to show you why the data should not be used).

But my not wanting to spend much time on it does not mean I think you unintelligent --it just means I'm lazy, and I think you can see the problem of adding a ton of not so accurate data to a model we are using to predict the future, if the reason is presented to you in a better manner than I can.

So that's a snip from Cory Doctorow on the problem with false positives --which introducing the tons of less accurate data into the climate modeling system would result in (just a bigger haystack, and a less accurate model to run tests on as well).

And that error rate might be either way in the ongoing discussion about climate change. But either way, for or against any 'side', doesn't matter because we would be making decisions on statistically BAD data --and adding more statistically bad data is not going to help.

Data Mining Old Ship's Logs
Tue Aug 05 07:04:42 -0700 2008

Forgive me if I don't idolize Doctorow like so many others. You need to realize that you're talking about accuracy in probabilities, not data. There's a huge difference in stating percentages of whether an event will occur and whether the temperature readings are off by 10-15%. I know you may not see the difference but if you gamble you know that a 4-6% house edge is enough to keep them in business for a long time but if you are trying to sketch out a picture of what a house looks like, you could be off by 20% and it will still look like a house.

I understand where you're coming from but I believe that if we discover that 100-500 years ago the weather was unseasonably hot and dry for 30 years or so at a time followed by an equal cool and rainy period, we can determine that this "global warming" thing is just mass hysteria (which is my guess). And I don't know about you but I don't have 100 or even 10,000 years to wait on "accurate" data to prove it one way or another. I'll settle for a sketch of a house rather than waiting on the gambling system that gets me the royal flush after waiting 100,000 hands when there were plenty of straights, flushes, full houses, etc. that could have been played much better.

Data Mining Old Ship's Logs
Tue Aug 05 07:46:44 -0700 2008

Cory is just Cory --I like his talks, but I don't qualify as a fan boy. He saved me some typing in this case, was all --the Wikipedia stuff on False Positives was better, but way longer. http://en.wikipedia.org/wiki/Type_I_and_type_II_errors

Personally I also think the current global warming thing is just another bit of nonsense, but the reason I don't like the idea of using this data for much is almost exactly for the reason you gave for using it:

"...100-500 years ago the weather was unseasonably hot and dry for 30 years or so at a time followed by an equal cool and rainy period..."

The problem is with 'unseasonably': how would they know that it was 'unseasonable'? (At that point in time they were even worse off than we are now as to knowing what 'seasonable' (normal) is supposed to be.)

The whole 'Global Warming' thing is only an argument over a few (very few) degrees in overall warming --or not, and is it man made --or not.

So any data used that is even a tiny bit more based on just 'unseasonable' --or any other mostly subjective data, or just less accurate data, is not going to really tell us much and will --in my opine, just add to the already borderline hysterical shouting match.

As to it helping be a recognizable house, I don't think that matters much, because I don't think it matters what we do or don't 'know' (recognize) about it --we are not really going to do anything anyway, and even if we could (let alone would) change our ways, I still don't think it would matter one whit.

It's a large planet and I expect that the changes, no matter how fast they may seem to happen, have their roots in long standing events --in other words if there is to be change it will already be committed, and any attempt we make to 'undo it' would require things, that even if we understood them, would take centuries if not millennia to effect.

Of course that is just another opine, nothing more. But adding what I see as dubious data will just encourage more people to weigh in with opines no more valuable than mine...

Surely you would agree we don't need that, do we?

Data Mining Old Ship's Logs
Fri Aug 08 15:28:17 -0700 2008

Telescopes right now use fuzzy data that is aggregated over time to produce sharper images.  How is this any different?  Any time you sample the real world, you have error.  Scientists know how to use statistics to understand the data and the error inherent in it.  Larger error means you need more data to give an equivalent confidence.  My guess is that there is a lot of data here available for use.
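The "larger error means you need more data" claim has a simple textbook form: the standard error of an average of n readings is sigma / sqrt(n). The sketch below works that through, assuming the errors are independent, unbiased, and of known size; the scatter values and the function names are hypothetical.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean of n independent readings with scatter sigma."""
    return sigma / math.sqrt(n)

def readings_needed(sigma_noisy, sigma_ref, n_ref):
    """How many noisy readings match the standard error of n_ref good ones."""
    return math.ceil(n_ref * (sigma_noisy / sigma_ref) ** 2)

# Hypothetical numbers: a modern sensor scattering by 0.5 degrees versus
# an old thermometer scattering by 2.0 degrees.
n_old = readings_needed(sigma_noisy=2.0, sigma_ref=0.5, n_ref=10)

print(f"old readings needed: {n_old}")
print(f"modern SE: {standard_error(0.5, 10):.4f}")
print(f"old SE:    {standard_error(2.0, n_old):.4f}")
```

An instrument four times as noisy needs sixteen times as many readings for the same confidence. The square-root rule says nothing about systematic bias, which does not average away no matter how many readings are collected.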

I mean seriously, don't you think it would be useful even to know if it rained on any given day in a certain region over vast periods of time?  That's about as fuzzy data as you can get and you can still glean useful information from it.  All information is useful.

Also, I don't see how the false positive "paradox" even applies.

Data Mining Old Ship's Logs
Fri Aug 08 17:45:48 -0700 2008

You have badly misunderstood what the notion of correction of fuzzy data means in relation to telescopic use. Here is a simple primer:
http://www.astronomynotes.com/telescop/s11.htm

From the site above:
Speckle interferometry can get rid of atmospheric distortion by taking many fast exposures of an object. Each fraction-of-a-second exposure freezes the motion of the object. Extensive computer processing then shifts the images to a common center and removes other noise and distortions caused by the atmosphere, telescope, and electronics to build up a distortion-free image.

I never said such information was useless, simply that it should not be used for climate model prediction, because the net effect of such data would be to decrease the minimal accuracy we now have.

Study the use of statistical modeling a bit (a lot) more and you will see why adding any data that is LESS accurate than what you already know will simply increase (by a large margin) any data based on a statistical model. (The False Positive Paradox)

I provided this link already, but as you must have missed it, I'll do so again:
http://en.wikipedia.org/wiki/Type_I_and_type_II_errors

Data Mining Old Ship's Logs
Fri Aug 08 18:46:36 -0700 2008

How have I misunderstood?  You've repeated back exactly what I said:  repeated sampling + math = clearer understanding.  I didn't really want to get into the specifics of some particular technology, it was just the point that we can sum up low-res pictures to get a high-res picture.

Correct me if I'm wrong, but you're essentially saying that adding a low-res picture to a high-res picture doesn't help the high-res picture get more hi-res.  But we know we can add many low-res pictures together to get a high-res picture, so I don't understand how one more low-res picture doesn't help.

Data Mining Old Ship's Logs
Fri Aug 08 19:06:29 -0700 2008

Repeated sampling "at an incredibly fast rate", all samples with the same degree of initial accuracy (or inaccuracy, if you prefer to look at it that way), and using the same testing (data capture) equipment.

Not samples over extended time, from various uncorrelated and only poorly documented sources, using widely varying equipment.

Hope that straightened that out for you. Reading the links provided would help you understand this better.
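The disagreement over mixing sources of differing quality can be made concrete. The sketch below compares two ways of combining one precise reading with one noisy reading: a plain unweighted average, and inverse-variance weighting, a standard technique that weights each reading by 1/variance. It assumes each source's variance is known and that errors are independent and unbiased, which is precisely what is in dispute for old logs. All numbers and function names are hypothetical.

```python
def unweighted_mean_variance(variances):
    """Variance of a plain average of independent readings."""
    n = len(variances)
    return sum(variances) / (n * n)

def inverse_variance_mean(values, variances):
    """Combine readings, weighting each by 1/variance.

    Returns (combined_mean, combined_variance).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    return mean, 1.0 / total

values = [15.2, 14.0]      # modern reading, old-log reading (made up)
variances = [0.04, 1.0]    # modern is precise, old log is noisy (made up)

naive_var = unweighted_mean_variance(variances)
weighted_mean, weighted_var = inverse_variance_mean(values, variances)

print(f"modern alone:       variance {variances[0]:.4f}")
print(f"plain average:      variance {naive_var:.4f}")
print(f"inverse-var weight: variance {weighted_var:.4f}, mean {weighted_mean:.3f}")
```

With these numbers the plain average is worse than the modern reading alone, while the weighted combination is slightly better. So both sides of the exchange above have a case: naive pooling of low-quality data does hurt, and careful weighting does not, provided the error sizes are actually known.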

Data Mining Old Ship's Logs
Fri Aug 08 17:54:33 -0700 2008

Sorry for the double reply, but I was having a slight problem with the login on the site, and in fiddling around with that I lost a few small but important words in my first reply to you:

This:
"Study the use of statistical modeling a bit (a lot) more and you will see why adding any data that is LESS accurate than what you already know will simply increase (by a large margin) any data based on a statistical model. (The False Positive Paradox)"

Should read:
Study the use of statistical modeling a bit (a lot) more and you will see why adding any data that is LESS accurate than what you already know will simply increase the margin of error (by a large margin) of any data based on a statistical model. (The False Positive Paradox)

Data Mining Old Ship's Logs
Mon Aug 04 19:06:36 -0700 2008

Didn't read where they were going to use this information in modern day computer models.  ....."build a detailed PICTURE of past weather and climate."

"A preliminary study of 6000 log books has produced results that raise questions about climate change THEORIES".

Like any historical documents you are only getting the view of the writer of those documents or that information. Just like Darwin's theories.

There are numerous computer climate models used today to predict climate or weather changes.  I know in my country they use at least 3 different computer models for forecasting.  They enter the latest data into all 3 and see which one matches the current weather situation the best as they "the forecasters" see it.

I personally think it is a good idea to look at these old records and ascertain what the climate was like at the time they were written, especially as climate change now makes international news.

Data Mining Old Ship's Logs
Mon Aug 04 19:22:08 -0700 2008

I have absolutely no problem with the use as you have described it, and even agree that it would, in fact, be useful.

And that is my point: "In theory, practice and theory are the same.. but in practice they are not." (well the quote is close anyway, wish I could remember who to give the credit to... Butler, maybe?).