Slashdot: News for Nerds

New York Times Wipes Journalist's Online Corpus

timothy posted more than 5 years ago | from the check-the-unpersonals dept.

Data Storage 94

thefickler writes "Reading about Peter Wayner and his problems with book piracy reminded me of another writer, Thomas Crampton, who has the opposite problem — a lot of his work has been wiped from the Internet. Thomas Crampton has worked for the New York Times (NYT) and the International Herald Tribune (IHT) for about a decade, but when the websites of the two newspapers were merged two months ago, a lot of Crampton's work disappeared into the ether. Links to the old stories are simply hitting generic pages. Crampton wrote a letter to Arthur Sulzberger, the publisher of the NYT, pleading for his work to be put back online. The hilarious part: according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."


94 comments

first post? (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#27964463)

not possible.

broken links? (2, Interesting)

mcfatboy93 (1363705) | more than 5 years ago | (#27964465)

the NYT is throwing away at least $100,000 for every month that the links remain broken."

now how much would it cost to fix all those links...

no wonder newspapers are not doing well

Re:broken links? (2, Informative)

mysidia (191772) | more than 5 years ago | (#27964657)

according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."

Also according to one analysis: the world is flat.

Apparently the NYT may have a different opinion.

Either that, or they're so large that $100,000 a month is insignificant to them, and this isn't the most viable cost-saving/revenue-improving project for them to start at this time.

Re:broken links? (3, Interesting)

pbhj (607776) | more than 5 years ago | (#27965205)

Personally I think that analysis is way out.

I'm seeing 396 results on Google for: "thomas Crampton" site:nytimes.com, out of 1130 results from the NYT on-site search engine.

5 of those google links are dated in the last week, which I assume are related to this story.

$100 000 per month estimated loss presumably is advertising revenue on page hits from links for those stories. Earnings of $5 CPM (i.e. $5 for every 1,000 visitors) would mean 20 million visitors a month are clicking through to his stories specifically and can't be satisfied with any other content.
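A quick sanity check of that arithmetic; the $5 CPM rate and the $100k/month loss are the commenter's assumptions, not NYT data:

```python
# Sanity check of the figures above; the $5 CPM rate and the $100k/month
# loss are the commenter's assumptions, not NYT data.
monthly_loss = 100_000      # claimed loss, dollars per month
cpm_dollars = 5             # assumed earnings: $5 per 1,000 visitors

# Visitors needed so that (visitors / 1000) * $5 equals $100,000.
visitors_needed = monthly_loss * 1000 // cpm_dollars
print(f"{visitors_needed:,} visitors/month")   # 20,000,000 visitors/month
```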

This would only be a loss if a similar / 404 / search landing page had a lower earnings rate.

Seems unlikely to me - I think this is just [very clever] linkbaiting from someone who, it appears, was sacked from the NYT and is trying to make a living elsewhere.

Re:broken links? (2, Interesting)

RealGrouchy (943109) | more than 5 years ago | (#27971857)

$100 000 per month estimated loss presumably is advertising revenue on page hits from links for those stories.

Forgive me, father, for I have RTFA:

According to Compete.com, IHT.com was getting over 1.5 million visitors/month before it shut down. If a third of those visitors were from search and direct old links, 500,000 visitors a month are hitting the dead end in the image above, instead of the page they were looking for. To buy that traffic from Google at $.20/click, you'd have to pay $100,000 a month.

So essentially, the "one analysis" says that if they wanted to buy the very-roughly-estimated traffic they hypothetically lost, it would cost them $100K to do so.

"They'd have to spend $100K to get the traffic they were before" is NOT the same as "they are losing $100K as a result of the lost traffic," which the "analysis" suggests.
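The article's estimate, reproduced step by step; the Compete.com traffic figure, the one-third share, and the $0.20/click rate are all the article's assumptions, not measured losses:

```python
# The article's own estimate, reproduced step by step; the Compete.com
# traffic figure and the $0.20/click rate are the article's assumptions.
iht_visitors = 1_500_000            # IHT.com visitors/month before shutdown
lost_visitors = iht_visitors // 3   # assumed share arriving via search/old links
cost_per_click_cents = 20           # assumed Google ad price, in cents

replacement_cost = lost_visitors * cost_per_click_cents // 100   # dollars
print(f"{lost_visitors:,} dead-end visitors -> ${replacement_cost:,}/month")
```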

- RG>

Re:broken links? (2, Interesting)

cream wobbly (1102689) | more than 5 years ago | (#27973343)

Personally I think that analysis is way out.

I'm seeing 396 results on Google for: "thomas Crampton" site:nytimes.com, out of 1130 results from the NYT on-site search engine.

That's why you're not an investigative journalist. The $100k/mo. estimate is for all IHT articles which were erased in the merger; not just the one author.

Do try to keep up.

Re:broken links? (0)

Anonymous Coward | more than 5 years ago | (#27965089)

It's not the links that cost money.

The stories were removed from the archive. They didn't just remove this guy's stories, they removed all of the IHT archives it seems.

It's expensive to keep data. You need the storage for it, you need to integrate it into your back-ups, and if you're using someone like IBM Global Services to manage your infrastructure, $100k a month might not be enough.

Plus $100k/month is just one person's estimate and the site is down so I can't tell if he's reasonable or not.

Re: Expensive Data? (1)

TaoPhoenix (980487) | more than 5 years ago | (#27966707)

I'm sure you are not a troll (YANAT), but the other side of slashdot is discussing how cheap data is, woe to the pay providers.

I think you mean that keeping data in a sophisticated manner is what grinds out IT Admin time, which eventually means a salary to pay. But the data itself is cheap, and 50,000 fellas on here can whack up something simple as a makeshift in a week for $5,000 and a month's supply of pizza&caffeine.

Re: Expensive Data? (1)

tagno25 (1518033) | more than 5 years ago | (#27967261)

But the data itself is cheap, and 50,000 fellas on here can whack up something simple as a makeshift in a week for $5,000 and a month's supply of pizza&caffeine.

I'm building a 3TB RAID 10 rackmount server (six 1TB drives) for around $850
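For reference, RAID 10 mirrors drives in pairs and then stripes across the pairs, so usable capacity is half the raw total; a sketch of the parent's math:

```python
# RAID 10 mirrors drives in pairs, then stripes across the pairs, so the
# usable capacity is half the raw total.
def raid10_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    if drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives")
    return (drive_count // 2) * drive_size_tb

print(raid10_usable_tb(6, 1.0))   # six 1TB drives -> 3TB usable
```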

Re: Expensive Data? (1)

TaoPhoenix (980487) | more than 5 years ago | (#27968585)

I'll grant you $150 in misc. expense.

The other $4000 is 2.5 hours of link maintenance a week for two years.

But even so, we agree that $100K is ludicrous. That's the price attitude that killed Wall Street.

I wish... (0)

Anonymous Coward | more than 5 years ago | (#27964507)

I wish I had someone to wipe my corpus for me. I always soil my fingers.

Wayback machine (4, Informative)

wjousts (1529427) | more than 5 years ago | (#27964539)

Groovy baby [archive.org] .

CNN's website doesn't have as many broken links. (4, Informative)

narfspoon (1376395) | more than 5 years ago | (#27964563)

CNN's website doesn't have as many broken links.
Articles over a decade old still work!
Whoever designed theirs deserves a lot of credit.

Re:CNN's website doesn't have as many broken links (3, Insightful)

noundi (1044080) | more than 5 years ago | (#27964709)

Am I the only one who finds this funny? They've managed to keep archives older than, oh my god brace yourselves, 10 years!!!

Seriously though, don't give them a standing ovation simply because everybody else fails. Tell me this in 50 years and I'll honestly clap my hands.

Re:CNN's website doesn't have as many broken links (4, Funny)

thedonger (1317951) | more than 5 years ago | (#27964789)

Tell me this in 50 years and I'll honestly clap my hands.

Pay me $100,000 per month and I'll dishonestly clap my hands right now.

Re:CNN's website doesn't have as many broken links (1)

WillDraven (760005) | more than 5 years ago | (#27965117)

Hell, the way things are right now you could pay me $10,000 a month and I'll gladly clap my hands 40 hours a week in whatever venue you deem most appropriate.

Re:CNN's website doesn't have as many broken links (4, Funny)

Cornelius the Great (555189) | more than 5 years ago | (#27967241)

...I'll gladly clap my hands 40 hours a week in whatever venue you deem most appropriate.

Well now, that depends on what you're willing to have in between your hands while clapping, and how soft your hands are...

Re:CNN's website doesn't have as many broken links (1)

StikyPad (445176) | more than 5 years ago | (#27968995)

I'll do it for $50k, and I'll pretend to be genuinely impressed too!

Re:CNN's website doesn't have as many broken links (1)

vaporland (713337) | more than 5 years ago | (#27976241)

my old national geographic magazines from 50 years ago still work. even the old ads are still there... NEAT!

seriously, in a hundred years there is going to be a huge history gap. it's great to read old magazines and books and newspapers. what is anyone in the year 2100 going to read from 2009? nothing will be printed out or compatible with whatever brain-link stuff they use in the future...

all you will have is old shaky-cam JJ Abrams videos as a record of the early 21st century... sad.

This sucks (3, Insightful)

ILongForDarkness (1134931) | more than 5 years ago | (#27964581)

We've come to rely on being able to find things on the internet; it's sad to think that information might go away and cease to exist. That said, I guess it depends on the writer's contract whether he has a right to have his body of work preserved. If a company pays for your work, it's theirs, not yours, unless your contract entitles you to it. Once you've sold your work to somebody, they can keep anyone from ever reading it and use it to line hamster cages for all the contract cares.

Re:This sucks (2, Interesting)

noidentity (188756) | more than 5 years ago | (#27964611)

Yet another reason why locking up content is wrong. Let it be freely copied, and then ANYONE who finds the work valuable can potentially become a caretaker of the work and keep it accessible online. Then the only way a work would disappear is if nobody has interest or time to preserve it.

Re:This sucks (1)

TaoPhoenix (980487) | more than 5 years ago | (#27966167)

Yep.

Remember this story from a few hours ago?

http://it.slashdot.org/article.pl?sid=09/05/15/0138204 [slashdot.org]

(~)
Locking up content is fun!! Then you can sue Pirates(TM) when someone copies a plane design. But if the site admin "never actually kept an offline copy" then years of data is gone!!
(/~)

(Speaking of which, that story sounds totally bogus. They coded live on production servers?? Sounds like they played dead.)

Re:This sucks (1)

ILongForDarkness (1134931) | more than 5 years ago | (#27966257)

I guess the problem then becomes that sometimes information isn't interesting enough to keep until something comes up that you need it for. For example, say someone writes a piece for the editorial section of a newspaper. Not really interesting. Now, 20 years later, they're running for political office. Those letters are suddenly interesting, but they had no apparent worth at the time.

I like the idea, though, that the consumers decide whether the content is useful enough to hang on to. In a way that is what P2P networks can do. Something that is quickly a fad isn't seeded much and becomes less available; something that is deemed important/enjoyed/whatever is highly available to be retrieved. Another big challenge, though: shouldn't the author be able to remove something in some circumstances? Say embarrassing photos they posted on a social site as a kid. Should those remain available without the originator's continued consent?

Re:This sucks (1)

wisty (1335733) | more than 5 years ago | (#27966803)

Just imagine what anthropologists in 1000 years time will think. They will mark this century as the turning point in human history (just like the last one ... and the one before that), and the only evidence of human culture is a few old tape back-ups from slashdot and 4chan.

Re:This sucks (1)

Cornwallis (1188489) | more than 5 years ago | (#27965151)

This reminds me of a scifi story i read probably 35 years ago about a library that had one file drawer filled with all of civilization's information/knowledge. Someone came up with the idea that it needed to be indexed hence another file drawer came into being holding the index. Then a cross index created a need for another drawer and on and on until the world ran out of room. They kept shrinking the size of the drawers until they became "subatomic" in size. They had to hold all the "drawers" on a separate planet until finally, amongst all the billions of index drawers, which if I recall correctly were stored on particles called "nudged quanta" they lost the one drawer that held the original information. I'd love to know who wrote that story so I can read it again!

Re:This sucks (1)

LordNimon (85072) | more than 5 years ago | (#27966359)

I'd love to know who wrote that story so I can read it again!

Probably the love child of Dr. Seuss and Isaac Asimov.

Re:This sucks (1)

History's Coming To (1059484) | more than 5 years ago | (#27966845)

"MS Fnd in a Lbry" by Hal Draper, originally published in The Magazine of Fantasy and Science Fiction, reprinted in "Laughing Space" by Isaac Asimov and Janet Jeppson, ISBN 0860513238

Re:This sucks (1)

Man On Pink Corner (1089867) | more than 5 years ago | (#27973429)

That'd be "Ms Fnd in a Lbry", but you'd do better to go back to the original source material, which would have been JL Borges's The Library of Babel.

Re:This sucks (0)

Anonymous Coward | more than 5 years ago | (#27965529)

We've come to rely on being able to find things on the internet,

You've obviously never used the Microsoft web site :-) every year or two they rearrange the entire thing, breaking all the links from outside sources and half the internal ones. It sucks if you are a Windows programmer and are trying to find documentation on a function in the SDK. (Note that the local help function is usually even more broken (and far slower) than searching the internet).

Re:This sucks (1)

ILongForDarkness (1134931) | more than 5 years ago | (#27966079)

Hehe, yeah. More than a GB of help files and still no help. Have you ever read the EULA for MSDN? It has nice phrases like "code supplied as is" etc. No guarantee that it will work, that it is suitable for any purpose, etc. Pretty boilerplate. (I've always found it funny, though, that documentation that says "this is how you do that" isn't held to at least work for what it is sold to do.) Then it has a bunch of stuff that says, in effect, you won't accuse MS of writing bad code, and if someone sues them because of a bug in code you borrowed from the docs, you will defend them, including paying legal fees. Nice; about as evil as you can get.

Re:This sucks (2, Interesting)

tlhIngan (30335) | more than 5 years ago | (#27967765)

Hehe, yeah. More than a GB of help files and still no help. Have you ever read the EULA for MSDN? It has nice phrases like "code supplied as is" etc. No guarantee that it will work, that it is suitable for any purpose, etc. Pretty boilerplate. (I've always found it funny, though, that documentation that says "this is how you do that" isn't held to at least work for what it is sold to do.) Then it has a bunch of stuff that says, in effect, you won't accuse MS of writing bad code, and if someone sues them because of a bug in code you borrowed from the docs, you will defend them, including paying legal fees. Nice; about as evil as you can get.

You do realize that it's sample code and thus is used to illustrate the API in question? As it's illustrative only, it will be missing a lot of essential code in the name of clarity. Stuff like error handling (the docs will tell you what it returns and how it returns it), parameter/return checks on associated API calls, or even input checks. After all, most people want to see how to use the API in a few lines of code, not deal with a 1000-line program because the author decided to check every return value (even the ones to printf()) and abort gracefully in every potential instance. That's not sample code, and extracting the "how do I use this API?" information from it is quite difficult because of the extraneous code.

It's assumed a halfway competent programmer would realize that, and use the API properly with proper error checking and input sanitization. Alas, that isn't the case most of the time, and you'll find the sample code copied-and-pasted into production code by codemonkeys who don't appear to think. If I was a particularly vicious developer at Microsoft, I might code the sample code with known security holes (but any halfway decent programmer would fix since it would be obvious) and then check applications for those holes later...

Bad Merge (1)

reSonans (732669) | more than 5 years ago | (#27964603)

This is so unfortunate. IHT was great before the merge, which was touted as a "new" version of IHT. Instead, they just canned it and attempted to transfer its content to the existing NYT site. And did a dreadful job, it seems.

I understand the logic - newspapers need to cut costs because they can't figure out the internet and it is killing them. But they lost a dedicated reader in me with this move.

The Internet Is the New Library of Alexandria (4, Interesting)

eldavojohn (898314) | more than 5 years ago | (#27964613)

And it's got unlimited space. Strangely enough, some people are adamant about keeping their works out of this library. And I say they have the right to ensure the internet forgets about them when they die. This poor soul seems to understand what's going on.

Re:The Internet Is the New Library of Alexandria (1)

mc1138 (718275) | more than 5 years ago | (#27964693)

An interesting point, and perhaps a bigger one points to the eventual shift away from a pay format for much of this information. Already we've seen a dramatic rise in piracy and in people going after free content. Taking this a step further, the marketplace will eventually push out the pay-per-use model for a lot of this information, be it the WSJ, music, or TV, with ads for consumer merchandise funding, at least for now, the demands on the infrastructure. My question is how long this can last: will ad revenue hold out until some new means comes about, or will the Internet, like the Library of Alexandria, burn, or in this case crumble under overwhelmed and under-maintained infrastructure?

Re:The Internet Is the New Library of Alexandria (0)

Anonymous Coward | more than 5 years ago | (#27965005)

An interesting point, and perhaps a bigger one points to the eventual shift away from a pay format for much of this information. Already we've seen a dramatic rise in piracy and in people going after free content. Taking this a step further, the marketplace will eventually push out the pay-per-use model for a lot of this information, be it the WSJ, music, or TV, with ads for consumer merchandise funding, at least for now, the demands on the infrastructure. My question is how long this can last: will ad revenue hold out until some new means comes about, or will the Internet, like the Library of Alexandria, burn, or in this case crumble under overwhelmed and under-maintained infrastructure?

Your post has already mentioned the solution and the problem.

I don't pirate because I'm too cheap to pay for content. I'd love to pay for a perpetual license to use content. But nobody's selling that. So I pirate, because if nobody's willing to sell me a perpetual license anymore, then I've got to assume that someday, that content ain't gonna be there no more. I pirate while I can, because I don't know how much longer the content will be available.

I'm vendor-agnostic on this. Whether it's DRM that depends on a remote activation server (evil vendors who call it "Windows Genuine Marketing", or nice vendors who call it "Steam"), or content that's hosted on someone else's site (as ephemeral as news articles, TV shows, etc), I grab a copy. For media, I'm not interested in streaming (looking at you, Tube), I'm interested in downloading (e.g. keepvid, or other YouTube downloaders). I want to have a local copy because I don't know if the video will be there tomorrow.

Yes, DVDs can be scratched/lost. Hard drives can fail. But those are risks that are largely under my control. I have no say in whether or not Steam's activation server goes dark, nor whether or not some TV show comes out on DVD (hands up, everyone that's still waiting for a legitimate release of Max Headroom?), nor in whether or not some guy in a suit says that it's not "economical" to spare a gigabyte or two (or more precisely, a few person-weeks of work in migrating a newspaper's archives from one content management system to another) for articles older than ten years.

Hard drive space is cheap enough that you don't need to worry about the Library of Alexandria burning down. Keep a local copy of the sections of the Library you're interested in, and an offsite backup. The odds of your copy and the online copies being burned down simultaneously are pretty slim.
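A minimal sketch of that advice, assuming you just want local copies of pages you care about; the commented-out URL is a placeholder, not a real archive endpoint:

```python
# Minimal sketch of "keep your own copy": derive a filesystem-safe name
# from a URL, fetch the page, and store it on disk for later.
import urllib.request
from pathlib import Path

def local_name(url: str) -> str:
    """Turn a URL into a filesystem-safe filename."""
    return url.replace("://", "_").replace("/", "_") + ".html"

def archive_page(url: str, dest_dir: str = "my_library") -> Path:
    Path(dest_dir).mkdir(exist_ok=True)
    out = Path(dest_dir) / local_name(url)
    with urllib.request.urlopen(url) as resp:   # fetch the live page
        out.write_bytes(resp.read())            # keep your own copy
    return out

# archive_page("http://example.com/some-article")
```

Pair the local directory with an offsite copy (rsync, a second drive) and you have the "slim odds of simultaneous burning" the parent describes.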

Re:The Internet Is the New Library of Alexandria (4, Funny)

PhilHibbs (4537) | more than 5 years ago | (#27965007)

And it's got unlimited space.

The internet is actually nearly full, I hope there is eno

Re:The Internet Is the New Library of Alexandria (0)

Anonymous Coward | more than 5 years ago | (#27972975)

The internet is actually nearly full, I hope there is eno

There is:
http://en.wikipedia.org/wiki/Brian_eno

And he's concerned about data over the long-term, too:
http://longnow.org/

Re:The Internet Is the New Library of Alexandria (1)

Richard Steiner (1585) | more than 5 years ago | (#27966901)

Sadly, data on the internet is currently a lot more volatile than the library of Alexandria, and the internet's contents are likely to survive for much less time. Digital media doesn't have the lifespan of ancient media, even papyrus. :-(

Are broken links the real issue? (1)

Iucounu (1417665) | more than 5 years ago | (#27964619)

The problem IMHO is not so much the broken links, but instead the desire (or lack of...) from the corporate overlord to retain "obsolete" content. Priority was given to the merger of both titles, without considering what makes a newspaper what it is: content.

Re:Are broken links the real issue? (1)

mdarksbane (587589) | more than 5 years ago | (#27970073)

I still find it strange that it seems to be only old-world, *major* corporations that have this problem so badly.

Every random kid's blog and webcomic has archives dating back to the day the thing started and easily accessible.

Re:Are broken links the real issue? (1)

Naturalis Philosopho (1160697) | more than 5 years ago | (#27972471)

Isn't there an "xkcd" strip about that... ;)

Error establishing a database connection (4, Funny)

six11 (579) | more than 5 years ago | (#27964671)

I was interested in reading the analysis that led to the $100,000 loss for every month the guy's work was offline. So, doing what you do, I clicked on the link and found it grandly hilarious to receive a 500 error stating: "Error establishing a database connection". Oh, the irony.

Re:Error establishing a database connection (0)

Anonymous Coward | more than 5 years ago | (#27964827)

Well they wanted to show you how not to make a website

Re:Error establishing a database connection (1)

djmurdoch (306849) | more than 5 years ago | (#27965535)

They're slashdotted, losing lots of traffic: so yes, it's ironic. But you can read the article if you want:

Paste the link "http://www.globaltechproducts.com/blog/1734/how-not-to-redesign-your-website-a-marketing-lesson-from-nytimescom/" into Google, you'll find the article in the Google cache.

The (to me questionable) basis for the calculation is that all old International Herald Tribune links are broken. It used to get X million hits per month, which are by a hokey calculation worth $100k.

Re:Error establishing a database connection (1)

sentientbeing (688713) | more than 5 years ago | (#27967063)

Posting that link into google also provides a google search link to _this_ thread.

this comment is missing from google cache, however...

Re:Error establishing a database connection (1)

14erCleaner (745600) | more than 5 years ago | (#27968723)

They assumed that a third of the 1.5m monthly hits are paid click-throughs from Google that are worth 20 cents each, hence the $100K. Pretty bogus. But even better, that article acknowledges that the Times are in the process of migrating the old stories over, so eventually the links will work again anyway.

Links should be permanent (4, Interesting)

code65536 (302481) | more than 5 years ago | (#27964749)

Whenever I redesign my site, I try hard to avoid changing any URLs. But if I do have to change a URL, I always make sure that there is a redirect (preferably an HTTP 301 permanent redirect) pointing from the old URL to the new one. Updating links is not enough, because you will always have links that come from external sites you don't control, user bookmarks, links found in "Hey, check this article out" e-mails, etc.

This is one of those basic principles of the web that the W3C (and for those who don't pay attention to them, you can substitute that with "plain old common sense" here) strongly recommends.

It means that users can always find and view content. It means that you still retain your ad revenue. It means that you still keep your PageRank for external sites that link. It means less bitrot and a more useful web...
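A minimal sketch of the 301 approach using only Python's standard library; the old-to-new URL mapping is illustrative, not any newspaper's real scheme:

```python
# Sketch of permanent redirects: old URLs answer with HTTP 301 and a
# Location header pointing at the new home, so links and PageRank survive.
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative mapping from retired paths to their new locations.
REDIRECTS = {
    "/iht/2005/old-story.html": "/2005/05/15/old-story.html",
}

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        new_url = REDIRECTS.get(self.path)
        if new_url:
            self.send_response(301)               # Moved Permanently
            self.send_header("Location", new_url)
            self.end_headers()
        else:
            self.send_error(404)                  # genuinely gone

# HTTPServer(("", 8080), RedirectHandler).serve_forever()
```

In practice the same mapping is usually expressed as `Redirect permanent` lines or rewrite rules in the web server's own config rather than in application code.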

Re:Links should be permanent (1)

MickyTheIdiot (1032226) | more than 5 years ago | (#27964931)

Well... we have a huge majority of "designers" out there who design to Microsoft dogma and can't even be bothered to check their web pages in Firefox on their own machine. They couldn't care less about any kind of good practice, let alone trying to conform to (or even read in the first place) the ideas the W3C has put out.

None of this is surprising to me in the least... just sad.

Re:Links should be permanent (1)

WillAdams (45638) | more than 5 years ago | (#27965651)

The problem there is this only works if one controls the _entire_ URL.

I had pages on AOL's FTP/webspace since its inception through AOL's ``sunsetting'' those services --- unfortunately, I published a number of papers which had links to http://members.aol.com/willadams [aol.com] so all the printed copies are out of date since there's no way to update them to http://mysite.verizon.net/william_franklin_adams/ [verizon.net]

It's this sort of thing which makes the MLA's decision to omit hard-coded URLs from their references understandable....

http://www.insidehighered.com/news/2009/03/11/mla [insidehighered.com]

But that's ignoring the problem.

William

You're continuing your problem. (1)

Fencepost (107992) | more than 5 years ago | (#27966779)

You complain about how all of your AOL-hosted links ceased to work and how you're unable to update all the places they were used to point to your (currently) Verizon-hosted content. Do you see the problem with this?

The solution to this is to get your own domain, so you retain the ability to move it at will. I started out with my primary domain (http://www.fencepost.net/ [fencepost.net] ) because I wanted a reliable email address after two successive ISPs were bought out. I would never use a carrier-provided email address as my primary, though I probably do have an @sbcglobal.net address that will continue to exist until AT&T decides to kill off the last of that Baby Bell.

As I see it, if you want a "permanent" online presence then you have two options: 1) control it yourself with a domain of your own, or 2) find an entity that you are positive will not cease to exist or restructure your presence out of existence.

Your best bet for #2 is probably an email address through your college (assuming you're a college grad) if your college's Alumni Relations office has set something like that up. Generally these are "forwarder" addresses (@alumni.mycollege.edu) that simply pass mail along to another address that you've provided them with, and sending email with that as your return address may be problematic depending on who your actual mailbox is hosted with. It's also not unheard of for colleges (particularly small/poorly funded ones) to go under. GMail does not qualify for #2. Some associations could be considered to qualify for #2 (e.g. ACM, IEEE) but if you're not using their other services then you're paying several hundred dollars a year just for an email address - a domain is cheaper.

For #1, sure it's going to cost you a few dollars and a little time each year, but anyone who's reading Slashdot should be able to register a domain and set up hosting. Simple registration is under $10/year, and depending on your needs hosting might even be available "free" from your registrar. You can also look at services such as NearlyFreeSpeech.net [nearlyfreespeech.net] , with hosting prices dependent on your traffic and a minimum deposit of $0.25. If all you're doing is email and a small static website that nobody ever goes to, throw a $10 deposit at them and you're probably set for years. (Disclaimer: I've never used these folks, but they're an example of how little it can take to get things started).

Re:You're continuing your problem. (1)

WillAdams (45638) | more than 5 years ago | (#27967747)

Acknowledged. For my part, I've quit putting my homepage URL in papers and instead will just upload stuff to CTAN and point to that.

I looked into registering a domain name, but couldn't find one I liked (not that I like william_franklin_adams) --- ::grrr:: squatters.

William

Re:You're continuing your problem. (0)

Anonymous Coward | more than 5 years ago | (#27971791)

No match for "WILLADAMS.NET".
>>> Last update of whois database: Fri, 15 May 2009 19:59:27 UTC <<<

Re:You're continuing your problem. (1)

SteeldrivingJon (842919) | more than 5 years ago | (#27968985)

The solution might be to place a GUID and keywords in anything you post online, and specify its location by saying "Google GUID1239872129412 Joe Schmoe Lemur behavior paper"

Then if you move hosts, it'll eventually get picked up by search engines and people will be able to find it, even if the URL itself has changed. (Hell, it might even find a copy someone made and posted at their own site.)
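A sketch of that idea, assuming the tag is a freshly generated UUID plus a few human-readable keywords; the keywords below are hypothetical:

```python
# Sketch of the GUID-locator idea: stamp each document with a unique tag
# so search engines can find it again wherever it moves.
import uuid

def make_locator_tag(keywords: str) -> str:
    """Generate a one-time tag to embed verbatim in the document."""
    return f"GUID{uuid.uuid4().hex} {keywords}"

tag = make_locator_tag("Joe Schmoe Lemur behavior paper")
# Embed `tag` in the page once; later, searching for the full tag string
# should surface every surviving copy, regardless of host.
```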

Another story about the necessity of backups.... (2, Interesting)

yogibaer (757010) | more than 5 years ago | (#27964753)

I feel for the guy and his lost articles, but I am wondering why he did not keep backups of everything? The stories seem to be gone forever, or else his letter would be about re-publishing his stories on his own website.... That is a rather bad case of negligence on the publisher's side, but more so on the part of Mr. Crampton. For comparison: I work with a professional photojournalist who has been working for 50 years now and has archived everything (more than 1.5 million pictures) like a mad squirrel. If you ask him about an article he wrote in 1961, it takes him about five minutes to find a copy of the article and the raw materials. Everything analog, but nonetheless... That makes you wonder whether, while embracing digital media and the blogosphere, many journalists have not brought with them the necessary tools to manage and archive their digital assets.

Re:Another story about the necessity of backups... (2, Interesting)

dancingmad (128588) | more than 5 years ago | (#27965107)

It's hard to tell from the linked article (yeah, I read it), but it doesn't seem that Crampton has no copies of the articles (surely he would keep copies of his own stuff); they're just not accessible on the Internet. All the links that should point to them from the NYT and the IHT went kablammo when the two sites merged.

There's no way a back up on his end could fix this problem.

Re:Another story about the necessity of backups... (2, Insightful)

pbhj (607776) | more than 5 years ago | (#27965281)

I feel for the guy and his lost articles, [...]

I feel for him too. Of course the articles aren't his; they are his employer's (unless he has a contract that says otherwise), which is probably why he's bothered. If they were _his_ articles, then he could upload them wholesale to his own site and reap the rewards (whatsoever they may be).

And THIS, dear-readers, is why paper will win (5, Interesting)

hacker (14635) | more than 5 years ago | (#27964819)

In the digital age, wiping out thousands of volumes of material takes mere seconds. Permanently. Gone. Poof.

We have books, printed books, which go back hundreds and hundreds of years (well, written material; the printing press is a fairly recent invention).

We don't even have a record of some newspaper articles that came out 5 years ago. We're LOSING our history, not retaining it, because we lack sufficient "printing" to always keep a copy in circulation. Witness the AVSIM.com [slashdot.org] debacle and hundreds of other cases where this has happened.

Until we can have a hard-copy of digital media which can NOT be changed, edited, altered or redacted... we're lost.

When we all have "Kindle DX2" devices in the classroom for digital copies of our textbooks... what is stopping them from "gently changing" some of the wording over time, over a few years, to permanently alter the way our youth views the history of times they never lived through?

How can you compare one version of a website today, with the one that was there last week? Was anything changed? Was article content "censored" in any subtle way?

We're heading down a very slippery slope, when digital information can't remain static enough to hold through the years, and be validated and verified to be unchanged, with sufficient copies in enough hands, to ensure survivability. The Internet is not the place to "store" things you want to keep for years and decades.

Re:And THIS, dear-readers, is why paper will win (1)

WillDraven (760005) | more than 5 years ago | (#27965295)

When we all have "Kindle DX2" devices in the classroom for digital copies of our textbooks... what is stopping them from "gently changing" some of the wording over time, over a few years, to permanently alter the way our youth views the history of times they never lived through?

What makes you believe this isn't already occurring with paper textbooks? I can't speak for the current crop (as new editions are pushed on schools practically every year), but when I was in middle / high school our social studies and history textbooks were so full of misleading half-truths and flat-out lies that I became thoroughly disillusioned with the public school system and dropped out the day I turned 16.

Re:And THIS, dear-readers, is why paper will win (3, Insightful)

Samrobb (12731) | more than 5 years ago | (#27965963)

Fahrenheit 451 [wikipedia.org] :

Only six weeks ago, I discovered that, over the years, some cubby-hole editors at Ballantine Books, fearful of contaminating the young, had, bit by bit, censored some 75 separate sections from the novel. Students, reading the novel which, after all, deals with the censorship and book-burning in the future, wrote to tell me of this exquisite irony.

Re:And THIS, dear-readers, is why paper will win (1)

PhxBlue (562201) | more than 5 years ago | (#27966031)

Until we can have a hard-copy of digital media which can NOT be changed, edited, altered or redacted... we're lost.

You mean like these [wikipedia.org] ?

Re:And THIS, dear-readers, is why paper will win (1)

hacker (14635) | more than 5 years ago | (#27966099)

Yes, except the shelf life of standard, single-use, recordable CDs is 5-8 years, max.

What do you envision happening when those CDs "expire" at that point? Copying the data down to a hard drive and re-burning every decade? Not feasible either.

Re:And THIS, dear-readers, is why paper will win (1)

PhxBlue (562201) | more than 5 years ago | (#27966489)

Yes, except the shelf life of standard, single-use, recordable CDs is 5-8 years, max.

I'm going to guess that varies based on the quality of the disc, because I have recordable CDs past that age that still work without a hitch. You're right that paper's the best way to go, but it's not the only way.

Re:And THIS, dear-readers, is why paper will win (1)

hacker (14635) | more than 5 years ago | (#27966547)

The dye-based CDs have a significantly SHORTER shelf life than the etched, commercial versions: fewer protective layers, cheaper discs, poor-quality dyes, etc.

Would not like to be a historian in the future (1)

Arimus (198136) | more than 5 years ago | (#27964849)

Much of what we know about past days comes from written material. With the move toward putting everything on the net and the decline of print, much of our history will be lost to generations to come purely through attrition as the internet changes (and I do not mean just the web; email, gopher, irc, usenet, ftp archives, et al. are all prone to this problem).

Then we have the problem of changing file formats, media which decays rapidly when compared to paper and decent inks, obsolescence of technology (try finding a laptop with a built in 3.5" floppy drive)...

In years to come this period will become the 2nd dark age.

Magazine websites do this all the time (3, Interesting)

Bill Dimm (463823) | more than 5 years ago | (#27964941)

My company links to articles on a lot of magazine websites, and I'm just amazed at how often the links become broken. Sites get redesigned and they don't bother redirecting the old URLs to the corresponding new locations. Or, even worse, they just discard all of the old articles, or random articles disappear or come up blank or mangled. Does it not occur to them that websites, search engines, and blogs are left with broken links? Do they not realize that people bookmark the articles?

Re:Magazine websites do this all the time (1)

pjt33 (739471) | more than 5 years ago | (#27965871)

Rather than bookmark an article I save a copy to disk. It's the surest way of being able to read it later. Even if the site's admins are competent enough to keep the URL pointing at the right place, there's no guarantee that the article won't disappear behind a paywall.

Re:Magazine websites do this all the time (1)

Chris Pimlott (16212) | more than 5 years ago | (#27966697)

This is why Google Notebook is (was) so nice - it makes it very easy to copy an article (with most formatting retained) while keeping the link to where it came from.

I've dabbled with some of the free replacements (like Zotero) but none have been able to match the features and ease of use of Google's service.

Has it been fixed? (1)

s.d. (33767) | more than 5 years ago | (#27965119)

I clicked on the two links listed at the bottom of the open letter to Arthur Sulzberger (both are IHT links), and both now redirect to the correct articles on the www.nytimes.com domain. Has the NYT fixed the problem, and just nobody bothered to mention it?

Re:Has it been fixed? (1)

pbhj (607776) | more than 5 years ago | (#27965381)

Works for me too. Perhaps the web dudes at the NYT were in cahoots to help him get this linkbait up.

No wonder the papers are dying (0)

Anonymous Coward | more than 5 years ago | (#27965145)

This just illustrates how little the print media understand the web. If they wanted to increase traffic to their site to increase ad revenue, they'd make sure they were capturing every view they can and building a content base that will be an asset for years to come.

Re:No wonder the papers are dying (1)

pbhj (607776) | more than 5 years ago | (#27965353)

Perhaps they want people to buy archive prints or reprints instead of looking up old content?

Do newspapers get many hits for old stories? I'd have thought most people go to a front page or section page and work from there to get their news.

Jumping the gun? All the articles seem to be up (1)

dtolman (688781) | more than 5 years ago | (#27965209)

I see over 1000 articles (with photos) by this guy on the Times website. And I can access all of them.

Re:Jumping the gun? All the articles seem to be up (1)

Chris Pimlott (16212) | more than 5 years ago | (#27966721)

Read TFA more closely. He has reported for both the Times and the IHT. It's his IHT work that has disappeared, while the Times stuff is still there.

But it's the NYT! (1)

WheelDweller (108946) | more than 5 years ago | (#27965265)

This is the same bunch that complained how (paraphrasing) "George Bush is such an asshole; he's having them DRIVE yellowcake uranium through the streets of Baghdad without concern for the safety of the inhabitants".

Where do I start?

- They'd been in contact with the uranium for some time.
- Maybe they'd teleport, Scotty?
- Isn't this the "Weapon of Mass Destruction" they were hoping not to find?

Of course it was; there was 8 TONS of the stuff, and nearly a ton was close to weapons-grade. But the article was quickly whisked off to the archive so you'd have to PAY to get a copy just days after printing.

The only reason you're learning about it, is that I read the original and posted it here.

THE POINT:

Newspapers, and for that matter news agencies, have a template to fulfill. It's no longer _news_, it's part of a story they tell. They spend all their time printing a paper no one wants, then complain how their customer base is 'un-hip' and "Joe Sixpack" and doesn't get the literary genius packed within. It's nuts!

Let's let them collapse.

Re:But it's the NYT! (1)

makomk (752139) | more than 5 years ago | (#27992857)

Errm... everyone knew Iraq had that stockpile of yellowcake; it wasn't a secret. The trouble was, since the IAEA had been monitoring it and it hadn't been touched, the US couldn't exactly claim that it showed Iraq was making WMDs. Hence the faked evidence of Saddam Hussein attempting to purchase more yellowcake that the IAEA wouldn't know about.


another part of terrible times web site (1)

cinnamon colbert (732724) | more than 5 years ago | (#27965393)

Any good /.er could go on and on about the problems of the Times website. I actually had to tell them that they needed a button so people could go back or forward one day at a time (any standard site for a journal has this feature - look at, say, the American Chemical Society journals: there is a button that goes forward or back one issue).

I have repeatedly told them their comments suck and they should have slashcode and a wiki - can you imagine how much traffic the Times website would generate if each of their great articles formed the basis of a community wiki article?

And I'm not that good at this - I'm sure any competent, experienced person could find hundreds of things they are doing wrong (if you want to GIVE the Times money, by being a customer and purchasing an ad, try to figure out how to do it from the website - it's embarrassing).

I guess the website is driven by old guys whose attitude is: it was good enough for hot lead set by hand, it's good enough for the web... NYT, RIP

Re:another part of terrible times web site (0)

Anonymous Coward | more than 5 years ago | (#27971801)

slashcode is too complicated for newspapers. I just mentioned this on a forum for the Sacramento Bee, where they're trying to figure out how to moderate comments. Slightly fewer trolls than 4chan. I posted this. My disdain for most media really shows.

-

Give registered users the ability to rank comments with a thumbs up / thumbs down. That will mostly take care of the junk. It's still open to abuse by uneducated nuts; it'd be a turf war on gay marriage or US torture.

Choose some Bee employees and let them moderate. I'm sure they wouldn't troll with their names attached, since they'd have to take responsibility for their actions. I've seen Bee reporters troll other blogs, which happens when no name is attached. Give them the ability to choose other moderators. It can continue this way, with non-troll moderators taking over. It's a community service, so people will have a sense of pride in their new volunteer moderation job. Once in a while, organize a "meet and greet" so these folks can drink coffee and talk. Much cheaper than paying some kid minimum wage to moderate.

If you choose that route, you'll still need a "register complaint" form so that one of your underpaid or non-paid interns can check whether someone is abusing their moderation capabilities. If so, they issue a warning or two and, if it continues, no more moderation for that person. The trickle-down effect of giving out super-moderation will eventually water it down to uselessness unless there's a stop.

You should also give moderators the ability to disemvowel comments like on boingboing.net. Jerks need to see their stuff in print, but if you remove the vowels of the comments, it takes the fun out of it.

There's an open source (that means free - no cost for a stumbling media empire) moderation system available at slashcode.com, which is used to moderate slashdot.org. Of course, you'd want to pay your programmers to modify this code. Media needs to simplify moderation, because moderation for the Great Unwashed doesn't need to be as fine-grained as it is for educated computer programmers and scientists. Once modified, you can use this moderation across the McClatchy empire, as well as sell a ready-made product to other companies.

Part of the reason general media is dying is that for national/international news, most people check elsewhere. Sorry about your purchase of Knight-Ridder yet? At least you're not as bad as the AP with its insipid lawsuits against Google.

You really need to beef up your local coverage. Local beat reporters covering smaller areas/neighborhoods. It's the internet so it doesn't cost much to web publish these stories. Most people like seeing places they recognize and their own names in the news.

Sacramento, at least according to Time magazine, is the most ethnically diverse city in the US. You can't tell that from the Bee. You need a bit of local coverage from different ethnic groups.

That'll also allow you to sell ads to different local businesses. That would be a bit sad for SNR and Midtown Monthly plus the multitude of other neighborhood papers as you take away their ad revenue.

Since this is Sacramento, your political news should be beefed up as well since that will sell. California Journal had a period where it was well read.

Finally, I also think the Bee needs some more muckraking. That's the heart of the Fourth Estate, isn't it? Bush signed off on torture. Children were sodomized in front of their parents in Gitmo. There are pictures of this. The US imprisons and kills foreign journalists. We bomb foreign journalists' offices. Not writing about this weakens respect for journalists, which is why the Daily Show is so popular. If you're not going to talk about this, why bother reading your paper? Instead, we watch Jon Stewart and Stephen Colbert and just check the local paper for local news.

With an active and informed comment section, you bring people in which helps everyone. Especially your bottom line.

the real story (1)

rs232 (849320) | more than 5 years ago | (#27965421)

Moving websites is a good time for purging embarrassing stuff, especially the comments section. One wonders what else is missing, especially from the archive. Ah, I just read this bit: the archives were erased in the move [blorge.com]. It takes willful action to lose your own archive. At least they didn't go back into the archive and replace the negative bits with adverts, like some other online newspapers do. Job well done, I guess :)

A "*pedia" in a far far future (1)

Ektanoor (9949) | more than 5 years ago | (#27965505)

Peter Wayner - author of a famous and well-known book on compression algorithms, which managed to survive the Big Howl of the Internet due to its relative popularity at the time it was written. It was recovered thanks to thousands of fragments found on hundreds of hard drives all over the world.

Thomas Crampton - a supposed journalist for the once-famous New York Times. His personality is quite obscure and nothing is known about him, except for a short reference in the once-famous Slashdot forum on the Internet. He is known for the statement "You erased my career", supposedly written by him in a letter (lost) after supposedly finding that all his work had been wiped by the New York Times (reference lost), supposedly a chronic problem of the newspaper in its electronic era (all references lost). Nearly all data on the New York Times has been lost, with the exception of a few million fragments of articles that may be found now and then, so it is nearly impossible to know who Thomas Crampton was, and probably we never will. The statement attributed to him is considered, today, a symbol of the Big Howl of the Internet.

Thomas Crampton is an idiot. (2, Informative)

DerekLyons (302214) | more than 5 years ago | (#27965627)

When you read the article, you find that one of the main reasons he wants the articles back up is that he himself doesn't have copies of them. TFA and Slashdot are full of angst toward the megacorp, but nobody seems to have noted this point.

Re:Thomas Crampton is an idiot. (0)

Anonymous Coward | more than 5 years ago | (#27971191)

If you read the comments, he writes that he has backups of most of his articles. But as all the links from other sites to them no longer work, he has lost a lot of his audience.

The issue has been resolved. (1)

fondacio (835785) | more than 5 years ago | (#27965657)

Interesting. I got quite upset with the IHT-NYT change a while ago for exactly this reason: many bookmarks and links to news articles that I had made throughout the years evaporated overnight, making me regret not printing or saving the text of those articles when I had the chance. But apparently the NYT has fixed it now. Crampton links to two articles from a scoop he had a few years ago, and they resolve to a new page. A bookmark that I have on the computer I'm working on now does the same thing, suggesting that they must have transferred their news archive to the new site.

The original bookmark: http://www.iht.com/articles/2009/02/24/opinion/edcardenas.php [iht.com] now resolves to http://www.nytimes.com/2009/02/24/opinion/24iht-edcardenas.1.20395821.html [nytimes.com]

I'll try it later with my other bookmarks, but it seems like they have responded to the criticism well.
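For what it's worth, the behavior fondacio's bookmark demonstrates can be sketched as a simple redirect table; the paths below are the ones quoted in the comment, and the function is purely illustrative of how a merged site might serve 301s so old URLs keep working:

```python
# Hypothetical sketch of a redirect map from old iht.com paths to the
# merged nytimes.com URLs; real sites usually do this in the web server
# or CMS, but the logic is the same.
REDIRECTS = {
    "/articles/2009/02/24/opinion/edcardenas.php":
        "http://www.nytimes.com/2009/02/24/opinion/24iht-edcardenas.1.20395821.html",
}

def resolve(path):
    """Return (status, location) for a request to an old iht.com path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]   # permanent redirect to the new home
    return 404, None                  # the "generic page" failure mode

print(resolve("/articles/2009/02/24/opinion/edcardenas.php"))
```

The broken-link complaints in the story amount to the 404 branch firing where the 301 branch should.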

Re:The issue has been resolved. (1)

fondacio (835785) | more than 5 years ago | (#27965791)

Apologies for the reply to self, but I tried a few more links, which did not resolve. The current IHT landing page [nytimes.com] says it all: "The most recent IHT articles can now be found by searching NYTimes.com. We are in the process of moving IHT articles dating back to 1991 over to NYTimes.com. Thanks for your patience as we complete this transition."


$100k / month? (1)

DoofusOfDeath (636671) | more than 5 years ago | (#27966149)

The hilarious part: according to one analysis, the NYT is throwing away at least $100,000 for every month that the links remain broken."

Analyses are a dime-a-dozen, and as we know from past experience, analysts are often biased, stupid, or insane.

So does it really matter that one analyst came up with a number that, if true, would make the NYT look foolish?

His political leanings? (1)

mi (197448) | more than 5 years ago | (#27966211)

I don't know anything about this gentleman, but, maybe, his writings simply go against the current Illiberal pro-Democrat bias of the paper? They weren't always this way — most famously, NYT used to be against government-mandated minimum wage [ncpa.org] until 1999.

Perhaps, they are trying to score some favors from the current government in the hopes of getting substantial financial help (a bailout [washingtontimes.com] , that was, no doubt, already promised to them) and certain writers are no longer welcome?

One does not need to be a "rabid partisan" to fall into disfavor - until recently the NYT weren't hiring such partisans anyway. Just not participating in the adoration fest [thepeoplescube.com] could've been enough. When the company's survival is at stake, one can't afford to take chances...

Waaaaah (1)

commodore64_love (1445365) | more than 5 years ago | (#27966353)

"They took my work and erased it! Please, mommy, help me!" - That's one solution. The other solution is for this journalist to get off his fat ass, buy a personal website, and publish all his back work for everyone to see.

You know, when I left Lockheed ten years ago, most of my work ended up in the dumpster too. That's life. If I felt it was important enough to publish, I'd simply copy it to my C: drive and later to my personal website. It's a much simpler solution than whining to my ex-boss. It's MY job to preserve my work, not his.

This journalist thinks his work is so all-important. Well then he should be willing to put up the money to publish it.

Re:Waaaaah (1)

seventhc (636528) | more than 5 years ago | (#27966409)

You have a C: drive?

Welcome to the Web (3, Insightful)

sjvn (11568) | more than 5 years ago | (#27966541)

One of the greatest delusions that people have about the Web is that almost all information can be found on it somewhere. What total nonsense.

Stories rot from the Web faster than newspaper print ever has or ever will. All that we're left with is the most recent version or revision, which may have *nothing* to do with what was first written.

If you don't keep copies of your work that appears on the Web, you might as well have thrown them into a fireplace. And, as for everyone else: if you assume for even a moment that what you read on the Web about what happened (even in technology news, even five years ago) reflects what people really wrote and thought at the time, you're a fool.

It's thanks to delusions like this that, for example, people can argue sincerely that Windows is popular because it's good; and not because Microsoft forced a monopoly on hardware vendors. Almost all the reports of DoJ vs. Microsoft from the time are long gone now. The proof that Microsoft's products are only popular because Microsoft made damn sure that no one else would have a chance to compete against them has vaporized.

The only thing newsworthy about what's happened here is that people think stories disappearing like this is in any way whatsoever noteworthy. It happens every day.

Steven

Fake figures - who says $100K (1)

fantomas (94850) | more than 5 years ago | (#27969597)

So where did the value of $100,000 come from?

"To buy that traffic from Google at $.20/click, you'd have to pay $100,000 a month"

So Google says it's worth 20 cents a click. What if I say it's only worth a cent a click? Then it's worth $5,000; at 0.1 cents a click, it's worth $500.

All make-believe. Don't tell me "an expert told you so", because I think a bunch of "experts" called "bankers" just got discredited a few months ago for overvaluing other virtual assets... ;-)

Except I guess this is America, so the writer is probably getting ready to sue the paper for $100,000 in lost monthly earnings, on the grounds that he read on the internet that he's lost $100K. Not bad for a bloke who probably earns $2,000 a month but will keep really quiet about the figure he actually earns :-)
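For the record, the arithmetic behind the disputed figures works out as follows (a quick sketch; the per-click rates are the ones proposed in the comment above):

```python
def monthly_value(clicks_per_month, dollars_per_click):
    # Dollar value of the lost traffic at a given per-click price.
    return clicks_per_month * dollars_per_click

# The cited analysis: $100,000/month at Google's $0.20/click
# implies 500,000 clicks/month hitting the broken links.
implied_clicks = 100_000 / 0.20

print(monthly_value(implied_clicks, 0.20))   # -> 100000.0 (the analysis)
print(monthly_value(implied_clicks, 0.01))   # -> 5000.0 (1 cent/click)
print(monthly_value(implied_clicks, 0.001))  # -> 500.0 (0.1 cents/click)
```

The click count is the only measured quantity; every dollar figure is that count multiplied by a chosen price, which is the commenter's point.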
