Ask Slashdot: Linux Mountable Storage Pool For All the Cloud Systems?

samzenpus posted about 2 years ago | from the one-raid-to-bind-them-all dept.

Cloud 165

An anonymous reader writes "Many cloud systems are available on the market, such as Dropbox, Google, SugarSync, or your local internet provider, that offer some free gigabytes of storage. Is there anything out there which can combine the storage into one usable folder (preferably Linux mountable) and encrypt the data stored in the cloud? The basic idea would be to create one file per cloud used as a block device. Then combine all of them using a software raid (redundancy etc) with cryptFS on top. Have you heard of anything which can do that or what can be used to build upon?"

There are several options here (5, Informative)

Omnifarious (11933) | about 2 years ago | (#42577169)

The first, and most interesting, is Tahoe LAFS [tahoe-lafs.org] . It does come with a FUSE driver [tahoe-lafs.org] , so it can be mounted like a regular filesystem. It is cloud-based and redundant to a degree you choose yourself. All copies stored are encrypted, so the only person who can read them is you. I'm not sure though if fetching from more nodes than you strictly need to reconstruct your original file actually buys you anything with that system, but I think it does.
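To make "redundant to a degree you choose" concrete, here is a toy sketch of the underlying idea: split the data into two halves plus an XOR parity share, so that any two of the three shares rebuild the original. This is only an illustration and not Tahoe-LAFS's actual encoding (Tahoe uses a more general k-of-n erasure code); the function names are made up for the example.

```python
from typing import List, Optional, Tuple


def make_shares(data: bytes) -> Tuple[List[bytes], int]:
    """Split data into 2 data shares plus 1 XOR parity share."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\0")  # pad the second half to equal length
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity], len(data)


def recover(shares: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the original from any 2 of the 3 shares (the missing one is None)."""
    a, b, parity = shares
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    # If only the parity share is missing, a and b are already complete.
    return (a + b)[:length]
```

With three providers holding one share each, losing any single provider still leaves the data recoverable, at the cost of storing 1.5x the original size.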

You could also use something like a mountable version of Google Drive and then layer fuse-encfs [arg0.net] on top of it. That's not quite as secure as encrypting at the block layer. The overall shape of your directory hierarchy is available, even if the individual file names and their contents are obscured. That should probably be good enough for most purposes.

Re:There are several options here (4, Interesting)

Omnifarious (11933) | about 2 years ago | (#42577191)

BTW, doing this at a block device level is likely a very poor idea. Block devices are very difficult to get right in a distributed fashion from a synchronization standpoint. They also are likely to cause a lot of excess network traffic since the units the system deals with are poorly matched to the logical units that are actually modified. A good distributed solution to this problem will at least have to know something about the fact that you have individual files to be at all reasonable to use.

Re:There are several options here (0)

Anonymous Coward | about 2 years ago | (#42578059)

A better option would be block storage and retrieval. That is, instead of storing the filesystem as a single large file representing a block device, store each filesystem block or block of your files as a separate file. The reason this is appealing is that it doesn't reveal to an adversary the size of your files, as it would if you simply encrypted them individually.

Re:There are several options here (5, Interesting)

Omnifarious (11933) | about 2 years ago | (#42578319)

Tahoe sort of achieves this in an odd way. Directories contain hashes of the file they reference instead of an inode number. This means that a Tahoe node often doesn't even know who a file really belongs to, even though it knows its length.

The main issue with block storage is this...

Suppose you modify a data section of a file in a btrfs filesystem mounted on some kind of weird encrypted block device. There will be a whole tree of blocks that get modified, all the way up to the root node. All of these blocks have to be written before the root block is, and for a small file there will be several more blocks that need updating than there are data blocks on the file.

These two issues create a big synchronization problem and a lot of extra traffic.

In contrast, a good distributed filesystem protocol that's aware of individual files can send a single message that contains some kind of identifier for the file, and the new data it should contain. This message will often be smaller than a single filesystem block, and it will also usually be compressed before it gets on the wire. Much more efficient and while there are synchronization issues between updates to individual files, within a file there aren't any.
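The traffic argument can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a copy-on-write tree in which changing a data block also rewrites one block per level up to the root, and it assumes 4 KiB blocks, a tree depth of 4, and 64 bytes of per-message overhead for the file-aware protocol. All of those figures are illustrative assumptions.

```python
BLOCK_SIZE = 4096  # bytes per filesystem block (assumed)


def block_level_bytes(changed_bytes: int, tree_depth: int) -> int:
    """Bytes written when a copy-on-write tree rewrites the path to the root."""
    data_blocks = max(1, -(-changed_bytes // BLOCK_SIZE))  # ceiling division
    # Changed data blocks plus one metadata block per tree level
    # (treating the path to the root as shared, which is a simplification).
    return (data_blocks + tree_depth) * BLOCK_SIZE


def file_level_bytes(changed_bytes: int, overhead: int = 64) -> int:
    """Bytes sent by a file-aware protocol: a small header plus the new data."""
    return changed_bytes + overhead


if __name__ == "__main__":
    change = 100  # a 100-byte edit to a small file
    print(block_level_bytes(change, tree_depth=4))  # 20480
    print(file_level_bytes(change))                 # 164
```

Under these assumptions a 100-byte edit costs about 20 KiB at the block layer versus well under 1 KiB for a file-aware message, before compression is even considered.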

Re:There are several options here (4, Interesting)

ultrasawblade (2105922) | about 2 years ago | (#42577823)

If you can mount a cloud service as a folder in Linux somehow, then Tahoe-LAFS can work. I know Dropbox lets you do this but am unsure about the other systems. If the cloud service allows upload/download via HTTPS, this could be worked around nontrivially by writing something using FUSE to translate filesystem requests to HTTPS requests recognized by that service.

You would have to have a "client" running for each cloud service. Each client has a storage directory which needs to be configured to be the same as the local sync directory for the cloud service. While Tahoe-LAFS is intended to have each client in a "grid" run on separate machines, there's no reason why multiple clients on the same grid could not be running locally. You'd just have to edit configs manually, setting the IP address to 127.0.0.1 and choosing a different port for each "client", and also making sure the introducer.furl is set accordingly.
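A rough sketch of what "edit configs manually" might look like, generating one config per local node. The key names (`tub.port`, `tub.location`, `introducer.furl`, `[storage] enabled`) are as I recall them from Tahoe-LAFS's tahoe.cfg and should be checked against the documentation for your version; the service names, port numbers, and the FURL placeholder are made up for illustration.

```python
import configparser
import io

# Placeholder only: the real value comes from your own introducer node.
INTRODUCER_FURL = "pb://<tubid>@127.0.0.1:<port>/<swissnum>"


def render_node_config(nickname: str, port: int) -> str:
    """Return tahoe.cfg-style text for one local storage node."""
    cfg = configparser.ConfigParser()
    cfg["node"] = {
        "nickname": nickname,
        "tub.port": str(port),
        "tub.location": f"127.0.0.1:{port}",
    }
    cfg["client"] = {"introducer.furl": INTRODUCER_FURL}
    cfg["storage"] = {"enabled": "true"}
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # One node per cloud service, each on its own port (hypothetical names).
    for i, service in enumerate(["dropbox", "gdrive", "sugarsync"]):
        print(render_node_config(service, 34000 + i))
```

Each node's storage directory would still need to be pointed at (or symlinked into) the matching cloud sync folder, which this sketch does not attempt.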

Tahoe-LAFS's capability system is pretty neat. Clients never see unencrypted data and you can configure the redundancy and "spread-outness" of the data however you like. Tahoe-LAFS's propensity to disallow quick "deleting" of shares also works well with possibly slowly updating cloud backends - Tahoe is designed to prefer to "age out" shares containing old files periodically rather than support direct deleting.

And Tahoe works as well on Windows as it does on Linux (it's a python script) so if your cloud service is Windows only that is no disadvantage.

Re:There are several options here (1)

Omnifarious (11933) | about 2 years ago | (#42577975)

Oh, yeah. *sheepish grin* Of course, most popular existing cloud services do not support LAFS out of the box. :-/ So yeah, you'll have to layer it on something in the manner suggested above.

Re:There are several options here (4, Insightful)

fuzzyfuzzyfungus (1223518) | about 2 years ago | (#42578083)

I get the impression that, while Tahoe LAFS is the good option, the submitter of TFS is looking for the super-cheap option. He wants some sort of terrifying 'RAID-0-over-a-handful-of-different-interfaces-to-a-half-dozen-free-services-so-I-can-scrape-together-a-couple-gigs-here-and-a-couple-there' amalgamation. Unless he's planning some redundancy, that sounds like a recipe for data loss even if it were simple to set up, and you'd still be looking at a relatively paltry amount of storage space.

It sounds to me like the submitter needs to decide whether he wants to step up and pay for some actual hosts(for which Tahoe LAFS would probably be a good option), or one of the more paranoid dropbox-clones, or whether this is simply an exercise in cobbling scrap together because that can be amusing sometimes...

GlusterFS (3, Informative)

Anonymous Coward | about 2 years ago | (#42577243)

It has optional encrypted transport if you use the native (fuse) mount. Encryption on the back end is on the road map for a future release. It's available for Linux, there's a NetBSD port, and has had working Solaris and OS X support in the past, it probably wouldn't be too hard to make those work again.

Re:GlusterFS (1)

KingRobot (703860) | about 2 years ago | (#42578957)

Yea, Gluster has a really nice API for writing what they call "translators". I would imagine it wouldn't be too difficult to write a translator that presents your cloud service of choice as a storage brick, and from there you just tie it all together as desired.

Re:GlusterFS (0)

Anonymous Coward | about 2 years ago | (#42579717)

We could GlusterFuk our GFS setup by randomly pulling power from machines. We decided it was not ready enough for us.

build upon dmraid (-1)

Anonymous Coward | about 2 years ago | (#42577267)

Sorry. That's a dumb question.

Why do you want to combine them? (4, Interesting)

egcagrac0 (1410377) | about 2 years ago | (#42577269)

If you don't trust the provider to keep your data intact, don't use that provider.

If you need more storage, pay for it. The cost is not prohibitive - 100GB or so for under US$10/mo is pretty easy to find.

If $10/month prices you out of the market, there are better things to worry about than encrypting files and storing them in the cloud.

Re:Why do you want to combine them? (1)

SpaceCracker (939922) | about 2 years ago | (#42577359)

I believe the asker didn't mention a price issue.
Availability is one reason to redundantly "split your eggs into more than one basket". Cloud outages happen from time to time. If one vendor is unavailable (temporarily or closed down indefinitely), you want your files to be available from another vendor.

Re:Why do you want to combine them? (1)

egcagrac0 (1410377) | about 2 years ago | (#42577453)

If availability is the goal, duplication (mirroring) would be the way to do it.

While technically mirroring is a mode of "RAID" (RAID 1), typically I hear "RAID" used to mean some form of spanning - combining more than one storage resource into one larger logical storage resource.

As for permanently closing down, if you're a paying customer, you have a reasonable expectation to receive notice that they're terminating the service offering. If you're getting it free, enjoy what you get while you can, and don't expect that complaints will get you more free lunches.

Re:Why do you want to combine them? (1)

icebike (68054) | about 2 years ago | (#42577537)

I agree that mirroring is the way to go, as long as all the cloud servers support some form of user-side encryption.

But I can see being worried about permanently closing down as well.
Does the FBI give notice when they seize the server farm?

Also, many of these services, especially the smaller ones, are just resellers of Amazon if I'm not mistaken, so in some cases even mirroring might not help, and any sort of RAID 5 could leave you with nothing if more than one of your chosen mirror vendors was ultimately stored on the same upstream provider.

Re:Why do you want to combine them? (0)

egcagrac0 (1410377) | about 2 years ago | (#42577677)

Anyone seriously considering putting their data in the cloud hopefully knows that they should keep a copy for themselves as well.

Beyond that, anyone seriously considering putting their data in the cloud needs to read and understand the SLA and ToS and AUP, and make sure that they all align with your goals. If your goal is to use a provider as a backup of your data, you shouldn't select a provider that specifically says that they don't make backups and won't restore your data in the event of a problem.

It should be easy to understand that blindly putting your data "in the cloud" without careful consideration of what that means and what problems come along with it is like using "rm -rf *" as a file compression utility. Sure, it frees up disk space, but the decompression software isn't working right yet.

Any solution to any problem has costs, benefits, and risks associated with it. One must evaluate the various solutions and select the solution(s) that best align with one's goals.

This AC is on the right track. [slashdot.org] One man's paranoia is another man's reasonable precaution.

Re:Why do you want to combine them? (1)

flyingfsck (986395) | about 2 years ago | (#42578989)

"keep a copy for themselves" If you do this distributed storage software right, then it will be fault tolerant and encrypted.

Re:Why do you want to combine them? (1)

Xtifr (1323) | about 2 years ago | (#42577729)

I think it was pretty obvious that he had mirroring in mind, since what he actually said was "using a software raid (redundancy etc)". (Emphasis mine.) That's the only place he mentioned RAID, so I seriously doubt his goal is striping! :)

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42578923)

Doesn't mean he's looking at RAID1 though. Could be 5 or 6 (or 1+0).

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42577435)

I don't trust any provider with raw unencrypted data with no redundancy outside them. Court orders, warrants, hacks, etc. have happened and will continue to happen. Amazon has lost data, as has Google.

I'm not a worshiper of the clowd clowns so I use commodity hard drives in a NAS with Linux and free tools.

Re:Why do you want to combine them? (2)

Xtifr (1323) | about 2 years ago | (#42577709)

If you don't trust the provider to keep your data intact, don't use that provider.

That's either a ridiculous statement, or completely off-topic. When it comes to reliability, trust isn't an absolute yes/no thing--it's measured in percentages. And redundancy multiplies reliability, so it's a big win.

There's a trade-off for complexity here, and it's possible to question whether all the extra effort is really worth the potential gains in reliability. (Is it really that important to have eight nines instead of four, or ten instead of five?) But there's nothing wrong with investigating the possibility. And he didn't say anything about price. Or amount of storage. Perhaps he's perfectly willing to pay $10/mo three times over, just for the satisfaction of knowing his data is super available.
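The "eight nines instead of four" arithmetic can be checked with a few lines. This is a minimal sketch that assumes provider failures are independent, which is a strong assumption (two vendors sharing one upstream provider would break it).

```python
import math


def combined_availability(single: float, copies: int) -> float:
    """Probability that at least one of `copies` independent mirrors is up."""
    return 1.0 - (1.0 - single) ** copies


def nines(availability: float) -> float:
    """Express availability as a count of nines, e.g. 0.9999 is about 4."""
    return -math.log10(1.0 - availability)


if __name__ == "__main__":
    for single in (0.9999, 0.99999):
        both = combined_availability(single, 2)
        print(f"{nines(single):.1f} nines alone, {nines(both):.1f} nines mirrored")
```

Under the independence assumption, mirroring across two providers doubles the number of nines, which is where four-to-eight and five-to-ten come from.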

For that matter, nobody, no matter how reliable, can guarantee you absolute security. Security is also something you have to measure in percentages (though it's a lot harder to estimate accurately). Encrypting your data gives you an extra layer of protection, even if you think your provider's security is good.

Re:Why do you want to combine them? (1)

egcagrac0 (1410377) | about 2 years ago | (#42578051)

If you don't trust the provider to keep your data intact, don't use that provider.

That's either a ridiculous statement, or completely off-topic.

Neither, actually.

In a design like this, I assume that a storage resource - in this case, a cloud provider - will be either online, or offline. If they're offline, I need to work with a different copy of the data. Using a striping arrangement (or striping with parity) rather than a mirrored arrangement means there may not be another copy available.
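The online/offline model above is easy to put into numbers. The sketch below compares the three arrangements under the assumption that each provider is independently online with the same probability; the 99% figure is made up for illustration.

```python
def mirror_availability(single: float, providers: int) -> float:
    """Mirroring: data is reachable if at least one provider is online."""
    return 1.0 - (1.0 - single) ** providers


def stripe_availability(single: float, providers: int) -> float:
    """Plain striping: data is reachable only if every provider is online."""
    return single ** providers


def parity_stripe_availability(single: float, providers: int) -> float:
    """Striping with single parity: tolerates at most one offline provider."""
    all_up = single ** providers
    one_down = providers * single ** (providers - 1) * (1.0 - single)
    return all_up + one_down


if __name__ == "__main__":
    p, n = 0.99, 3  # illustrative: three providers, each online 99% of the time
    print(f"stripe:        {stripe_availability(p, n):.6f}")
    print(f"stripe+parity: {parity_stripe_availability(p, n):.6f}")
    print(f"mirror:        {mirror_availability(p, n):.6f}")
```

Plain striping comes out worse than a single provider, parity recovers most of that, and mirroring is the most available of the three, at the cost of usable capacity.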

Re:Why do you want to combine them? (0)

brillow (917507) | about 2 years ago | (#42577739)

Don't tell me what's important for me you arrogant twat.

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42577949)

Sorry.

Re:Why do you want to combine them? (1)

theNetImp (190602) | about 2 years ago | (#42577765)

OK, say you're a hobby photographer. At $10 a month per 100GB, backing up 1TB of images is VERY cost prohibitive. $100/month for cloud storage is terribly high.

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42577847)

OK, say you're a hobby photographer. At $10 a month per 100GB, backing up 1TB of images is VERY cost prohibitive. $100/month for cloud storage is terribly high.

OK, just fess up - it's your pr0n collection, right? 1TB of images at a gargantuan 20MB apiece is over 50000 images; at a more reasonable 5MB that increases to 200k+. "Hobby photographer" my foot.

Re:Why do you want to combine them? (4, Informative)

Gaygirlie (1657131) | about 2 years ago | (#42577903)

OK, just fess up - it's your pr0n collection, right? 1TB of images at a gargantuan 20MB apiece is over 50000 images; at a more reasonable 5MB that increases to 200k+. "Hobby photographer" my foot.

You've clearly never heard of RAW-images. 20MB RAW-image is actually still on the smaller end of the scale.

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42578311)

Yeah, just try shooting on anything higher end. My Leica s2 has munched my harddrive space super fast with 72 meg raw files.

Re:Why do you want to combine them? (-1)

Anonymous Coward | about 2 years ago | (#42578371)

RAW-images are pointless and from photographers who don't understand the tools. Yes, it gets you exactly what the camera records, but there are lossless compression formats to use that will push the file size back down to reasonable sizes. To relate this to music, I understand WAV is better than MP3 for quality, but (as a basic example) WAV+ZIP is still better, and there are much better options than WAV or RAW for audio or images.

Re:Why do you want to combine them? (3, Insightful)

blueg3 (192743) | about 2 years ago | (#42578619)

Most RAW file formats (RAW isn't a file format itself, but a designation covering a number of different formats) already include lossless compression.

Even if they didn't, compressing would create a lot of unnecessary work, because they're mostly valuable in the form that can be manipulated by image editing and management software. Anything other than the actual RAW format is dramatically reduced in value. Again, though, that's a moot point, because RAW formats generally include lossless compression already.

Re:Why do you want to combine them? (5, Informative)

WalrusSlayer (883300) | about 2 years ago | (#42578797)

Uh, methinks you haven't really used tool chains designed to maximize the value of RAW files. The camera's built-in processor does way the hell more stuff than just compress raw pixels into JPEG. White balance is a huge one, along with level curves, sharpening, and a bunch of other stuff. Much of it either one-way or very hard to unwind. And as others have pointed out, most RAW *is* compressed, just lossless.

So yeah, you can fix white-balance in a JPEG, but it's way simpler and more accurate to set the white balance if the pixels haven't already been misbalanced in the first place. Ditto for exposure. Most tools that deal with processed JPEG's don't even have an exposure adjustment---quite often the same tool that does both file types will have an exposure slider if it's RAW but not if it's JPEG. Sure, you can futz with brightness, contrast, levels, gamma, etc to correct an under-exposed shot. But sliding over to +2/3 for a slight underexposure is one click and you're done.

As a guy who has deep-drilled many a software engineering discipline in his 25 year career, and shot tens of thousands of frames as an amateur enthusiast, you can pull me out of the "photographers who don't understand the tools" pool thank you very much.

I have gone back and forth between JPEG and RAW over the years. There have been periods where, with two small children, I simply didn't have time to invest in RAW processing. And I was pleased with the neutrality of the DSLR's processing anyway. Other times I knew I was shooting in challenging conditions, and set the camera to RAW+JPEG as a safety net. I've rescued many a shot that way. Recently I've been putting mileage on Lightroom and can extract an immense improvement out of the RAW's that would take me 4x the time to do if they were JPEG, and probably not end up with the same result. I now have more time to invest and the payoff is real and significant.

Re:Why do you want to combine them? (2)

semi-extrinsic (1997002) | about 2 years ago | (#42580003)

Since you say you've alternated between JPEG and RAW shooting, I have two questions (out of genuine interest):

1) For a reasonably well-exposed photo where the white balance is roughly correct in the camera, are you able to produce a significantly better end result from RAW than from JPEG? (I definitely agree on using RAW+JPEG when you know exposure could be a problem)

2) Do you have any rough idea about the bit depth the RAW photos need to be at before you get a significant advantage over JPEG? My old camera produced 10 bit RAWs, and at that time I was almost never able to out-perform the JPEG. My new camera has 12 bit RAW, and I haven't really had much time recently (small children here as well) to play around with RAW. But maybe it would be worth it?

Re:Why do you want to combine them? (4, Informative)

BlackPignouf (1017012) | about 2 years ago | (#42580147)

1) For a reasonably well-exposed photo where the white balance is roughly correct in the camera, are you able to produce a significantly better end result from RAW than from JPEG? (I definitely agree on using RAW+JPEG when you know exposure could be a problem)

Short answer : No
Longer answer : It depends on the light, the sensor, the image processor in camera and your RAW workflow.
From personal experience, I'd say that Canon JPGs are pretty good out of camera, Nikon JPGs lack a bit of sharpening, and Fuji X sensors have very good JPGs that are still impossible to match with RAW+Lightroom.
I use RAW as a safety net during events or weddings, so that if I get a picture with good expression, focus and composition but wrong exposure or WB, I can still save it and print it instead of having to delete it.
RAW is also interesting for scenes with high dynamic range, such as landscapes or concerts.

Do you have any rough idea about the bit depth the RAW photos need to be at before you get a significant advantage over JPEG? My old camera produced 10 bit RAWs, and at that time I was almost never able to out-perform the JPEG. My new camera has 12 bit RAW, and I haven't really had much time recently (small children here as well) to play around with RAW. But maybe it would be worth it?

I think it has more to do with dynamic range than with bit-depth. Just find a contrasty scene, take a RAW picture and try to retain details in both shadows and highlights with your RAW conversion software.
http://www.dpreview.com/learn/?/Glossary/Digital_Imaging/dynamic_range_01.htm [dpreview.com]
http://www.dpreview.com/learn/?/Glossary/Digital_Imaging/tonal_range_01.htm [dpreview.com]

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42579233)

"Media" doesn't just mean photography. Most folks also take videos on their photography trips---with perhaps 100gigs of "media" (1-5gigs of images and rest in HD video) per expedition, after a few years, a 3T drive is barely enough to hold 1 copy... (so I got three machines each with two disks, mirroring each other). This is not a pro operation, just a traveling hobby---and I don't like to erase things (especially since in a few years, 3T will seem tiny---kinda like ``gigabytes'' were huge back in the day).

Re:Why do you want to combine them? (1)

egcagrac0 (1410377) | about 2 years ago | (#42578027)

Then perhaps offline backups are a better choice in your application.

A USB hard drive or two and a safe deposit box should be substantially more affordable.

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42578053)

That hobby photographer will probably need to spend a helluva lot more than $100/month to be able to actually back up 1TB to a remote location. It would take somewhere around 29 weeks worth of uploading on an average "broadband" connection for that amount of data.

Re:Why do you want to combine them? (1)

DarwinSurvivor (1752106) | about 2 years ago | (#42578389)

Not to mention he'd go through about $300/month in overage charges when he hits his cap every month.

Re:Why do you want to combine them? (1)

blueg3 (192743) | about 2 years ago | (#42578621)

It would take about 13 weeks with 1 Mbit up, which is on the low end of readily-available broadband data.
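The 13-week figure is straightforward to reproduce. A minimal sketch, assuming a decimal terabyte (10^12 bytes), a fully saturated uplink running around the clock, and no protocol overhead:

```python
SECONDS_PER_WEEK = 7 * 24 * 3600


def upload_weeks(size_bytes: float, uplink_bits_per_second: float) -> float:
    """Weeks needed to upload size_bytes at a sustained uplink rate."""
    seconds = size_bytes * 8 / uplink_bits_per_second
    return seconds / SECONDS_PER_WEEK


if __name__ == "__main__":
    one_tb = 1e12  # decimal terabyte
    print(round(upload_weeks(one_tb, 1e6), 1))  # 13.2
```

By the same formula, the 29-week estimate upthread corresponds to a sustained uplink of roughly 0.46 Mbit/s.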

Re:Why do you want to combine them? (1)

donaldm (919619) | about 2 years ago | (#42579183)

If you don't trust the provider to keep your data intact, don't use that provider.

Yes, that goes without saying; however, you still must say it, especially if you are consulting with a customer who is considering cloud storage. It's a bit like talking about backups: it is amazing that this critical service is sometimes a low priority with some companies.

If you need more storage, pay for it. The cost is not prohibitive - 100GB or so for under US$10/mo is pretty easy to find.

Yes, $10 a month is not that expensive for 100GB. However, if you consider TBs of data (not that difficult if you consider movies etc.), the cost starts to climb, and for a home user that $100/month is starting to get expensive.

If $10/month prices you out of the market, there are better things to worry about than encrypting files and storing them in the cloud.

For companies, a professional backup system is much more practical than "Cloud Storage", although this could have a starting cost of a few thousand dollars, going up to millions of dollars depending on the backup requirements of the company. For a home user it is actually cheaper to use portable disk drives as your backup service; however, one problem with that is the fact that your data will normally reside in your home unless you have an arrangement to off-site it to a trusted friend or neighbour.

It must be noted that what I am calling backups in my reply are not really what can be considered "backups". They are actually mirrored copies of the appropriate data, made for example by using the rsync command to copy from the file-system or directory structure you want to duplicate and maintain to the target file-system where you wish to have your data mirrored. The target file-system can be on the so-called cloud or a storage device.
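As a rough sketch of the rsync mirroring described here: the helper below only builds the command line, defaulting to a dry run so nothing is touched by accident. The paths are hypothetical, and `--delete` is what makes this a mirror rather than a backup, since deletions on the source are propagated to the target.

```python
from typing import List


def rsync_mirror_command(source: str, target: str, dry_run: bool = True) -> List[str]:
    """Build an rsync command that makes target an exact mirror of source."""
    cmd = ["rsync", "--archive", "--delete"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without changing it
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", target]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(rsync_mirror_command("/home/me/photos", "/mnt/backup/photos")))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the dry-run output looks right.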

Re:Why do you want to combine them? (1)

semi-extrinsic (1997002) | about 2 years ago | (#42580009)

The best solution is to have a friend who also runs a Linux server at home. Or hell, even give your friend an old Linux box and set up a Samba mount on it that he can access from Windows. You then each buy two harddrives, and mirror each other. If you don't trust your friend not to snoop on your photos, or vice versa, use encryption.

Re:Why do you want to combine them? (0)

Anonymous Coward | about 2 years ago | (#42580153)

You can get 1TB of storage for ~$20 per month with Rsync support -> https://backupbay.com/show/plansandpricing
Bitcasa is another cool service which offers 'unlimited' storage for $10, if you trust their encryption though.

Re:Why do you want to combine them? (1)

beachcoder (2281630) | about 2 years ago | (#42580133)

Quite. You can get unlimited data for an absurdly cheap price [jdoqocy.com] .

Don't trust the cloud (5, Interesting)

Anonymous Coward | about 2 years ago | (#42577305)

My residential internet connection via Comcast is fast enough today that I can pull files off of my server at home, "cloud" style.

I have two 2TB drives in RAID1, encrypted with whatever magic `cryptsetup' performs, with port 22 of my firewall forwarded to the server. SSH only accepts logins from me. I consider my data to be more secure and easier to access (it's literally seconds away from availability on any real operating system anywhere with internet access. Windows need not apply) than anything I could get from ZOMG TEH CLOUD. Only disadvantage is speed. I'm not gonna be shunting gigabyte plus files around like this.

Added bonus: easy to add users, easy to throw up a web interface, can do whatever you want with it, since you own the hardware (!!)

Pfft, cloud. I remember when it was called 'the internet'.

Now get the fuck off my lawn.

Re:Don't trust the cloud (4, Funny)

gripped (830310) | about 2 years ago | (#42577437)

SSH only accepts logins from me.

You hope

Re:Don't trust the cloud (0)

Anonymous Coward | about 2 years ago | (#42578253)

SSH only accepts logins from me.

You hope

SSH keys with pass-phrases can do that pretty well.

Re:Don't trust the cloud (2)

DarwinSurvivor (1752106) | about 2 years ago | (#42578393)

If he's using them. AND if he remembered to also turn OFF password authentication.

Re:Don't trust the cloud (2)

dskoll (99328) | about 2 years ago | (#42578579)

Turning off password auth is Basic SSH 101.
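For anyone who wants to check their own box, here is a small sketch that scans sshd_config text for the `PasswordAuthentication` keyword. It is deliberately simplistic: it ignores `Match` blocks, `Include` directives, and the `keyword=value` spelling, so treat the result as a hint rather than an audit.

```python
def password_auth_disabled(sshd_config_text: str) -> bool:
    """Return True if the config explicitly sets PasswordAuthentication to no."""
    for line in sshd_config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "passwordauthentication":
            # sshd uses the first value it obtains for a keyword.
            return parts[1].strip().lower() == "no"
    # Keyword absent: OpenSSH allows password logins by default.
    return False


if __name__ == "__main__":
    sample = "PubkeyAuthentication yes\nPasswordAuthentication no\n"
    print(password_auth_disabled(sample))  # True
```

On a typical Linux install the text would come from /etc/ssh/sshd_config, and sshd needs a reload before a change takes effect.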

Re:Don't trust the cloud (1)

DarwinSurvivor (1752106) | about 2 years ago | (#42580031)

And yet still missed by WAAAAAY too many Linux users. Otherwise those bot-net brute-force attacks would've stopped years ago.

Re:Don't trust the cloud (1)

icebike (68054) | about 2 years ago | (#42577559)

I consider my data to be more secure and easier to access (it's literally seconds away from availability on any real operating system anywhere with internet access.

The thing about a cloud is that there are (if you choose the correct provider) multiple widely separated storage locations with redundant copies.
Your setup, with both of your drives (and I wager also your backup copies) all sit in the same house.

One match. One thief. One flood. One thunderstorm.

Re:Don't trust the cloud (0)

Anonymous Coward | about 2 years ago | (#42578171)

> Your setup, with both of your drives (and I wager also your backup copies) all sit in the same house.

Yeah, the data isn't terribly important though. That being said, I don't trust any third party with any of my data (no matter how trivial the information: this is why I dumped Apple and went back to Linux after using OS X from 10.3 to 10.7) unless it's fully encrypted BEFORE it leaves my systems to some remote server. I don't know of any good way of accomplishing this, short of keeping everything in a big encrypted container (created with `dd' for example) and maybe chopping it up for transit. More trouble than it's worth. Suggestions welcomed.

I'm waiting for there to be a good Linux based smartphone on the market with enough space (256 GB would suffice for me) that I can access with SSH. Then I'll write a script to rsync up an encrypted container on the phone with my important files stored on my server's 'vault'. I could easily script up mount/unmount of the container on the phone for on-the-go access, and then I'd have everything on my person all the time. Phone lost? It's all encrypted on the phone anyway.

Can I do that with an Android? I'm still on (old) iPhone 4 / iOS 5.

Re:Don't trust the cloud (1)

icebike (68054) | about 2 years ago | (#42578257)

I don't know of any good way of accomplishing this, short of keeping everything in a big encrypted container (created with `dd' for example) and maybe chopping it up for transit. More trouble than it's worth. Suggestions welcomed.

Yup, rolling your own is kind of painful. You can get it all working, then turn your back and it's gone to hell on you for no obvious reason.

For that reason, I keep critical records and codebase in SpiderOak.
Encryption happens in your machine, they do not have the key, and couldn't decrypt your data even if served with a warrant.
Might not be suitable for large collections, if for no other reason than the time and bandwidth involved.

They have free accounts, but I pay them some pittance each year for 100 gig.

Re:Don't trust the cloud (1)

Blaskowicz (634489) | about 2 years ago | (#42577611)

Easy to throw a web interface? I had installed Apache and looked at the kilometer long configuration file and was horrified. I installed WebDAV but thought it looked pretty useless. Fucked around trying to find a usable "web file manager", but I didn't find anything great and don't really know how to install them. Maybe on Windows you could get a setup.exe that sets up everything. Sorry, I don't know how throwing up a web interface is "easy". I know a fuck ton about computers and some administration, but I have no web dev experience.

At least anyone can use Filezilla (Windows does apply)

Re:Don't trust the cloud (3, Interesting)

Blaskowicz (634489) | about 2 years ago | (#42577635)

btw there's sshfs on Windows, I thought it would be pedantic to mention it but it exists albeit a bit slow.

Re:Don't trust the cloud (1)

jedidiah (1196) | about 2 years ago | (#42577965)

> Easy to throw a web interface? I had installed Apache and looked at the kilometer long configuration file and was horrified. I

That's much like whining about the size of a Windows application's registry hive.

You must also be frightened by any fully featured modern video transcoder.

Re:Don't trust the cloud (3, Informative)

Voyager529 (1363959) | about 2 years ago | (#42579409)

> Easy to throw a web interface? I had installed Apache and looked at the kilometer long configuration file and was horrified. I

That's much like whining about the size of a Windows application's registry hive.

You must also be frightened by any fully featured modern video transcoder.

No, there's a smidge of difference.

The overwhelming majority of Windows applications can be configured using a series of dialog boxes, typically either in the "tools->options" or "edit->preferences" menu. These applications may incidentally store the results of those dialog boxes in a registry hive (or in an ini file in the %appdata% folder or similar), but it's infrequently the only way to make such changes. With Apache, they don't give you a tabbed, categorized dialog box in which to manipulate the options. Similarly, someone who "installed Apache...and was horrified" is probably not well-versed in working with HTTP server software, and thus, editing the Apache config file is going to be a mountain of guesswork as to what you'd really want in the first place. On top of that, there's the "you can usually fidget around to get Apache to do what you want it to do, but be really really careful because the easiest way to get it to work is also usually the most hackable, so if it works right with your instinct, you'll probably have to go back and change it later once you do end up getting it to work".

As for video transcoding, unless you're a masochist who prefers using FFMpeg on a command line instead of the myriad GUI options, video transcoding CAN be as easy as "choose your source video, pick the general format you want it to end up in or the type of device you want it to go on, and click 'transcode'". In those cases, most of the advanced options are optional, and the defaults are generally close to what you want unless you know specifically that you need a particular non-default option somewhere. This is different than trying to get a web server up and running, especially since there's no security consideration to the video transcode.

Re:Don't trust the cloud (2)

Drinking Bleach (975757) | about 2 years ago | (#42577967)

You picked the wrong web server; Apache is great but its configuration is indeed difficult especially if you're not familiar with the concepts. Try out lighttpd, it's pretty dead simple.

Re:Don't trust the cloud (0)

Anonymous Coward | about 2 years ago | (#42577989)

You can skip over 99% of the configuration settings for Apache. Most (if not all) are preconfigured and will work out of the box. Usually all you need to do is put some files in /var/www/ (depending on distribution) and they'll come out of port 80. The Windows version comes with its own installer and it's preconfigured the same way. The installer asks you where to place the server root and all you need to do is place your files there. Firewall configuration however is a different story and not part of Apache so I won't mention it here.

Re:Don't trust the cloud (2)

Voyager529 (1363959) | about 2 years ago | (#42579469)

Making your own web interface for file management? somewhat challenging. Finding a canned one that doesn't utterly suck? Well, that's what Sourceforge is for =)

Ajaxplorer:
http://sourceforge.net/projects/ajaxplorer/?source=directory [sourceforge.net]
Simple to use browser app, and there are iOS and Android apps that do a great job.

Extplorer:
http://sourceforge.net/projects/extplorer/?source=recommended [sourceforge.net]
Better support for larger quantities of files and browsing using a traditional tree/file pane, but slightly more complicated UI due to the smaller, more nebulous buttons.

As far as getting it to run on something, your best bet is to either try XAMPP, or better yet (if you've got the RAM for it and enough hard disk space), grab a copy of VirtualBox and head over to TurnKeyLinux.org, where they've got pre-configured LAMP stacks with plenty of browser based applications, including Ajaxplorer, which you can have up and running, perfectly configured, in twenty minutes or less =).

Re:Don't trust the cloud (1)

Anonymous Coward | about 2 years ago | (#42578047)

You may want to change your ftp and ssh ports to be non-standard ones if you have an internet-facing home server. Automated port scans usually pick up such services and will brute force your connection 24/7, lowering your available bandwidth. Check your logs if in doubt.

spideroak (3, Informative)

characterZer0 (138196) | about 2 years ago | (#42577329)

Spideroak (http://www.spideroak.com) does what you want. It encrypts data on your machine before sending it to the cloud.

Re:spideroak (1)

Anonymous Coward | about 2 years ago | (#42577547)

So does Wuala: https://www.wuala.com/ [wuala.com]
I wonder why the poster didn't try searching the web for a phrase like "encrypted Dropbox alternative".

Re:spideroak (1)

man_of_mr_e (217855) | about 2 years ago | (#42578113)

Wuala is nice, but not widely supported by third party apps (particularly in the mobile space where you don't typically have control over where files are stored).

Re:spideroak (1)

icebike (68054) | about 2 years ago | (#42577581)

True, and I really like the fact that you can set it up to keep file changes so that you can step back in time to retrieve last week's code base, or deleted files. It has lots of flexibility.

But it does not do the other half of the OP's requirements, of mirroring or raiding the data to multiple physical locations.

Re:spideroak (1)

characterZer0 (138196) | about 2 years ago | (#42577775)

If you keep a second machine up with the Spideroak program running, it will mirror your data. It would be nice if there was an option to run the program and pull the encrypted data but not decrypt it, so you would have the backup if Spideroak's systems go down, but your data would not be compromised if that machine was.

Re:spideroak (1)

icebike (68054) | about 2 years ago | (#42577909)

If you keep a second machine up with the Spideroak program running, it will mirror your data.

That is an option. But it's not a requirement.
SpiderOak operates in three distinct modes:
Backup
Sync
Share
Each is individually selectable.

Re:spideroak (0)

Anonymous Coward | about 2 years ago | (#42579985)

If you keep a second machine up with the Spideroak program running, it will mirror your data. It would be nice if there was an option to run the program and pull the encrypted data but not decrypt it, so you would have the backup if Spideroak's systems go down, but your data would not be compromised if that machine was.

It does have an option where you can select that it keeps a copy of the data blocks locally.

Re:spideroak (0)

Anonymous Coward | about 2 years ago | (#42579381)

I appreciate their privacy policy and attention to detail. I was so impressed I was ready to roll up my samba server at work and pitch this to my boss as an alternative for shared file storage. And then I looked at their pricing.

$600 per *month* for 1T plus $5 per user? I know they can provide more than my daily backup and weekly offsite copy, but I offer 1T, daily backup, raid mirroring for a tiny fraction of that, with only a few hours a month of checkup and testing. Honestly, there might be a few hours (a day on a weekend?) a year we are down due to a flaw, and we have to buy a couple of drives a couple of times over several years. But, compared to almost $22k for 3 years I don't know why a unit would consider this unless they needed 100% bulletproof, faster-than-daily backup. And without competent supervision on the client end you won't even get that. (I mean, it won't be properly rolled out or managed/updated.) I'm not saying it's a bad option, just expensive for most recession-era units. [Beyond this I can cheaply set up the most critical users with external drives and back them up to those, backup directories on their own machines and my own servers, etc., without coming within a minor fraction of that cost.]

really... (1)

CdBee (742846) | about 2 years ago | (#42577357)

... being a free software user doesn't mean you need to be a free service user: If you aren't paying, you aren't the customer.

I use both Google Drive & Dropbox (for different usage cases and purposes) but my really important backups - including everything from both the other two - go into Amazon S3, as I have a contract there with the supplier, and knowing I'm a paying customer of a profitable service means I'm much less likely to have to rethink my backup strategy due to a withdrawal of a free offer. The time spent doing an initial backup of all the files I want to protect means I don't want to have to do that often; incremental backups are much easier to live with.

preferably linux mountable (3, Informative)

vlm (69642) | about 2 years ago | (#42577365)

preferably linux mountable

You'll find a userspace script solution to be infinitely simpler. A script that clones such and such directory onto such and such other directory while encrypting is simple, another script to clone that encrypted directory into some other directory (basically just rsync). Run it periodically outta crontab, etc.

90% of your effort will be expended on error detection / correction / reporting, 9% of your effort on key management for the encryption and keeping the individual services up and running, and probably about 1% on the actual nuts and bolts of copying stuff around while possibly encrypting.

There are more failure modes than you'd think... consider giant files, for example, which don't fit. Or running it outta crontab and somehow having two copies running simultaneously. Or your scratch directory is on a device that suddenly got remounted RO instead of RW due to developing hardware issues.
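A minimal sketch of guarding against those three failure modes before a sync run starts. The function names, the lock file location, and the size limit are all made up for illustration; the only library calls used are Python's standard `fcntl.flock`, `tempfile.TemporaryFile`, and `os.path.getsize`, and `flock` assumes a Unix-like system.

```python
import fcntl
import os
import tempfile


def try_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if another run holds it."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the other run.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def is_writable(directory):
    """Probe a scratch directory by really creating a file in it.

    This catches a device that was remounted read-only, which a
    permission-bit check alone would not notice.
    """
    try:
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        return False


def fits(path, max_bytes):
    """Return True if the file is small enough for the target service (limit is assumed)."""
    return os.path.getsize(path) <= max_bytes
```

A cron-driven script would call `try_lock` first and simply exit when it returns None, then check `is_writable` on the scratch directory and `fits` on each file before copying anything. The lock is released when the returned file is closed or the process exits.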

Bidirectional sync is ambitious but possible. You'll burn a seemingly infinite amount of bandwidth trying it (think about the next quote for a second)

The basic idea would be to create one file per cloud used as a block device

Thankfully you're just mirroring instead of requesting some kind of raid-5 like technology. Also you're just dumping "a big ole backup file" rather than individual files.

Encrypt it before you store it (1)

vlm (69642) | about 2 years ago | (#42577419)

encrypt the data stored in the cloud

Oh, and another thing: it's infinitely more secure to encrypt the data before "putting it up on your homemade mirror network" rather than as part of the mirroring process.

For example, 99.99999999% of the data I "control" does not need to be encrypted. It just simply doesn't matter, even to a paranoid, although those know no rational limit....

Another example: let's say you were backing up a sql database of usernames/passwords for some site. The wrong way to do it is to store the passwords in plain text and then encrypt the backup. Wrong for about a zillion (obvious?) reasons. If you have a decent system to hash and/or encrypt the data in the DB itself, that's much better, and no one can do anything with the encrypted data anyway. Or at least your database-level-backup script (as distinct from this project) can encrypt it for you (even if it's just piping mysqldump thru mcrypt and then into a file)
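For the "hash it in the DB itself" option, a rough sketch using only Python's standard library (`hashlib.pbkdf2_hmac`, `hmac.compare_digest`, `os.urandom`). The function names and the iteration count are assumptions picked for illustration, not a recommendation for any particular site.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); only these two values would be stored in the DB."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

With this in place a leaked backup exposes salts and digests rather than plain-text passwords, which is the point being made above. Encrypting the dump on top of that is still worthwhile for everything else in the database.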

I just use emacs (1)

Billly Gates (198444) | about 2 years ago | (#42577425)

You may need another text editor though

Can be done with a FTPfs, raid and encfs (4, Interesting)

devitto (230479) | about 2 years ago | (#42577433)

Someone's already done & blogged about this, using multiple free FTP accounts, with a FTPfs bringing them local, then mounting a RAID (mirrored & parity) partition over it, and encfs over the top of that.

It was VERY SLOW, but did work, even when he blocked access to some of the FTP accounts - it was just seen as a failed drive read, and the parity reconstruction still permitted access.
I think the key problem was that the FTP servers he used (or the FTPfs driver) didn't allow for partial writes to files, so every time you changed something, large amounts of data were re-uploaded. So there were possibilities for optimization.....
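The "failed drive read, parity reconstruction" part can be illustrated with a toy XOR parity scheme. This is a sketch of the general single-parity idea (as in RAID 5), not the blogger's actual setup; the function names are invented and each "account" is just a bytes object here.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data blocks, one per (hypothetical) FTP account, plus one parity block.
data = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data)
```

If the account holding `data[1]` is unreachable, `rebuild_missing([data[0], data[2]], parity)` gives back `b"efgh"`. Single parity only survives one missing account at a time, and every small write also means rewriting parity, which fits the slowness described above.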

Enjoy & share if you get anywhere !

Dom

Re:Can be done with a FTPfs, raid and encfs (0)

Anonymous Coward | about 2 years ago | (#42577953)

Solvable using 512 or 4k (cluster)-sized files... you'll get a lot of files but that's more of a server problem as long as no directory listings are ever requested (directory traversal slowdown can be mitigated by grouping files into separate folders).
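A sketch of what that layout could look like: one small file per 4k block, grouped into subfolders so no single directory gets huge. The block size, the blocks-per-folder count, and the file naming are assumptions for illustration only.

```python
BLOCK_SIZE = 4096       # one remote file per 4k block
BLOCKS_PER_DIR = 1000   # assumed grouping to keep directories small


def block_path(offset):
    """Map a byte offset on the virtual device to the remote file holding it."""
    block = offset // BLOCK_SIZE
    return f"{block // BLOCKS_PER_DIR:06d}/{block:012d}.blk"


def blocks_touched(offset, length):
    """List the block numbers a write of `length` bytes at `offset` must re-upload."""
    if length <= 0:
        return []
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return list(range(first, last + 1))
```

A 10-byte write at offset 4090 touches only blocks 0 and 1, so two 4k files get re-uploaded instead of the whole container. Since paths are computed from the offset, the client never needs a directory listing.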

More to the point for the author, a major bottleneck will be bandwidth. Be it your own connection or upload/download restrictions imposed by the server.

However, your biggest and most difficult problem will be dealing with failures. Network interruptions, corruptions, disconnections... A block device over the internet is in my opinion highly volatile. I've tried such a feat with NFS (fs, not even block device) over simple WiFi and it was an utter disaster. Network failures caused the block driver to lock up on timeouts and enough writes failed to corrupt the file system completely. Raid might mitigate that, but even with a theoretical raid 9 (think raid 5 with 5x parity), it's highly likely to end up in a situation where an unfortunate failure during a rebuild causes the entire array to fail.

As for privacy / security... It's a bit of a moot point considering you can layer block devices on top of each other. Creating an encrypted block device on another block device on which the actual file system resides is trivial.

I do not know of any block-level driver that can handle excessive amounts of read and write failures and attempt recovery reliably. This tahoe-lafs seems promising but even that comes with its own network protocol, making it entirely unusable as most (cheap) storage providers will usually only grant you http(s) or ftp access at best.

OwnCloud (0)

Anonymous Coward | about 2 years ago | (#42577451)

Give OwnCloud a try.

http://owncloud.org/

Truecrypt and Dropbox (1)

Billly Gates (198444) | about 2 years ago | (#42577463)

I use both and there are instructions here including a script [lifehacker.com] where you run l.bat to set it up and sync.

However, it seems your use case is a little different than a personal backup.

Cloud Striping (4, Funny)

lucm (889690) | about 2 years ago | (#42577469)

Forget redundancy, just go with "RAIC-0": unleashing the true power of the Cloud by striping providers!

Re:Cloud Striping (0)

Anonymous Coward | about 2 years ago | (#42577779)

That sounds so awesome and so stupid at the same time.

Re:Cloud Striping (0)

Anonymous Coward | about 2 years ago | (#42579145)

To get best speed, make sure each cloud is on its own Internet.

Imagine a Beowulf cluster of internets.

Winner

EXCITED KIDS (2)

petur (1833384) | about 2 years ago | (#42577553)

Just pay for it FFS, why try to combine different free services, and go through the trouble of running your own linux server in order to save $10 a month oh my god, "!#$ excited kids.

CEPH/RADOS/RBD (2)

Heebie (1163973) | about 2 years ago | (#42577577)

You could use CEPH to do the distribution, then RADOS to create an RBD (Rados Block Device) and when you mount the RBD as an iSCSI device, you could then build a cryptfs device on top of it, so the provider of the RBD couldn't read/write the data without the keys stored on your server (or wherever you keep them.) The difficulty is getting something like this that is product-ized, so that a provider can give enough economy-of-scale to make it really worthwhile.

Bitcasa (2, Interesting)

Anonymous Coward | about 2 years ago | (#42577737)

Bitcasa is an encrypted block based filesystem which mounts via FUSE and streams to the cloud behind the scenes. Has really intelligent caching built in and works with all major platforms (Lin, Win, Mac).

Linux client hasn't been updated as much as the other platforms but should catch up soon.

Full disclosure- I'm the CEO of Bitcasa.

Re:Bitcasa (0)

Anonymous Coward | about 2 years ago | (#42580127)

Bitcasa is an encrypted block based filesystem which mounts via FUSE and streams to the cloud behind the scenes. Has really intelligent caching built in and works with all major platforms (Lin, Win, Mac).

Linux client hasn't been updated as much as the other platforms but should catch up soon.

Full disclosure- I'm the CEO of Bitcasa.

My problem with bitcasa (and dropbox and the rest) is that they don't support "denyable-they-don't-have-the-encryption-key" security. Spider-oak's the only one that offers such security although their user interface needs a bit of work for more advanced tasks but for simple backup it suffices.

Why do this ? (5, Insightful)

Alain Williams (2972) | about 2 years ago | (#42577743)

He has not said why he wants to do this, ie what problem he is trying to solve. Depending on the question the answer may be different. Does he want a cloud because:

* data must be available from many places - ie over the Internet ?

* data is to be safe from one place (ie home/office machine) blowing up and losing everything ?

* fast access is needed from many places at once ?

Please first answer these questions so that we may provide you with what you need rather than random solutions that may not be what you need.

Re:Why do this ? (0)

swb (14022) | about 2 years ago | (#42578373)

Yes.

Re:Why do this ? (4, Insightful)

bazorg (911295) | about 2 years ago | (#42580395)

I'd wager that OP is more interested in using 5 free accounts supplying 10GB each than to pay a monthly rent for 50GB.

Truecrypt and custom share location (1)

ninlilizi (2759613) | about 2 years ago | (#42577767)

Most sharing services let you specify a shared directory location of your choice.
Why not just set them to all share the same directory? Then just create a Truecrypt volume inside of it.
The better services' remote differential compression will take care of the rest.

git-annex (0)

Anonymous Coward | about 2 years ago | (#42577873)

I manage my files using git-annex [branchable.com] which supports a number of storage services [branchable.com] with seamless GPG encryption.

The Cloud? LOL! (1)

Mister Liberty (769145) | about 2 years ago | (#42577899)

Avoid it.
Punt

LVM (1)

Scarred Intellect (1648867) | about 2 years ago | (#42578907)

Wouldn't a(n) LVM accomplish this? Set up a bunch of logical devices, put them into an LVM, and let that take care of itself?

OwnCloud? (4, Informative)

RanceJustice (2028040) | about 2 years ago | (#42579025)

I too have been looking for a solution for "denyable-they-don't-have-the-encryption-key" secure, remote storage, backups and the like. Platform independence and standards compliance are important; I don't want to get locked into a proprietary ecosystem. It's even better if there's a nice GUI and usability that doesn't require guru-level knowledge to access, and pricing isn't insane. Thus far I've found a handful of tools that seem to be the best of their breeds - CrashPlan for instance allows encrypted, secure multi-site backups (your own PCs, friends' PCs, their servers), unlimited bandwidth/storage space etc... but it is only meant for backups, not sharing or accessing the data frequently. SpiderOak is a fantastic Dropbox alternative, Linux-friendly (both GUI and CLI for those interested) and seems to be amongst the best of the "Cloud (tm)/ Dropbox" type file-hosting/sharing services. However, as the OP specifically notes that they are looking for a unified solution to bring most or all of those remote hosted/"Cloud" stuff under a single mantle, there seems to be one project that has that goal in mind - OwnCloud.

I've been watching OwnCloud (www.owncloud.org) since I heard of it, happy to see an open-source, standards-compliant, "installable on your own hardware as well as rented hosting etc.." universal, modular data storage/sync operation that can be totally under your own control. It has a ton of features, but most notable in this case is exactly what the OP wants: the ability to mount your Google Drive or Dropbox share and have your OwnCloud install interact with them. It looks to be a really promising project and I really hope that a lot of coding gurus join and take notice; if my skill was sufficient, I'd be looking to contribute. It is a relatively new platform and I am sure it will have some growing pains (ie. I do not know if it supports ALL "cloud drive" shares, for instance SpiderOak...), but it supports everything from a built in media player, Card/CalDAV, backups, LDAP, and seems to have amazing potential. I am told that Version 5.0 will be the next big leap forward in terms of polish. Check it out and those that can contribute, please do so. It seems the best option to have user-friendly, open source, secure "cloud" services without bolstering hegemony aspirations by Google, Microsoft, and many others.

FreeNAS + OpenVPN (2)

ternarybit (1363339) | about 2 years ago | (#42579127)

FreeNAS + OpenVPN is my "cloud" storage. Decent Comcast upstream at home means I have direct access to all my files anywhere, via a single UDP socket secured with certificate-based authentication and encryption. I take special solace knowing I own the hardware my data touches, and FDE on all endpoints ensures another layer of protection.

Curlftpfs + Archivemount will allow it on any ftp. (0)

Anonymous Coward | about 2 years ago | (#42579417)

Curlftpfs + Archivemount will allow it on any ftp. Just set up an account, mount it with curlftpfs somewhere, create a tar file on the mount, and mount that to the folder you want.

This way you'll get all the posix privileges (user,group,others.rwxrwxrwx) and so on. Another advantage is that, theoretically, it should be possible to do the same thing on windows with a virtual mount. But I don't think the software for that is available FOSS or at all...

Re:Curlftpfs + Archivemount will allow it on any f (0)

Anonymous Coward | about 2 years ago | (#42579623)

I forgot to add that if you aren't satisfied with sftp or you don't trust the service provider, you don't have to ssh-tunnel the connection. You can preallocate a binary file with dd and then fuse-mount it as an encrypted file system of your choice.

Note that the real benefit of ftp is the lack of overhead. Those packets are as raw as it gets...

Market opportunity (0)

Anonymous Coward | about 2 years ago | (#42579479)

Sounds like a business opportunity to me - providing simple access to data in the Internet cloud that is mountable (either Linux or other) and strongly encrypted by default such that ONLY the owner or other authorized entity can access it in a decrypted form.

OP wants to use existing free services (Dropbox et (0)

Anonymous Coward | about 2 years ago | (#42579619)

The OP wants to use free storage he already has (using Dropbox, Google Drive, SpiderOak, etc.), not pay for his own storage. While I would heartily recommend TAHOE-LAFS or sshfs+encfs or ownCloud or anything self-hosted that requires paying for storage, that's not what the OP is asking for.

Openstack swift (1)

GPLHost-Thomas (1330431) | about 2 years ago | (#42580131)

Probably, that's not what the OP is searching for, but Openstack swift is a very interesting cloud storage solution which has redundancy, so I thought it was a good idea to raise the topic in this thread.

S3QL (1)

mrvanes (658171) | about 2 years ago | (#42580151)

Have you looked at S3QL http://code.google.com/p/s3ql/ [google.com] ? Mountable infinite Amazon S3 storage via fuse (no limited blockdevice setup).