Web file sharing
Description
Provide a simple file sharing service that allows a user to share files with others with a basic level of confidentiality.
Requirements / constraints
- allow exporting individual files;
- the files are stored directly on a file system (either local or networked);
- the files to be exported can be anywhere on the file system (of course there are access policies applicable);
- the files could be owned by different users;
- the solution shouldn't require any privileged access (like root, special capabilities, etc.);
- each exported file is accessible only to those that have the proper link, thus allowing a minimum level of confidentiality;
- each file could be exported multiple times, under different URLs;
Assumptions
- the files are managed (uploaded, etc.) by other means than the application (for example SCP, FTP, etc.);
- the user management is done through other means (e.g. LDAP, passwd, etc.);
- the path or contents of an exported file is assumed to not change;
Proposed solution
Workflow
- the user uploads the file to be exported via SCP, FTP, or any other means allowed by the storage server;
- the user accesses a simple web application which presents a file-browsing interface where he identifies the file to be exported (the one uploaded at the previous step);
- the user chooses to export that particular file, and obtains a unique, secret, URL;
- the user sends the obtained URL to any other person (called "guest") that needs to access that particular file;
- the guest accesses the URL and downloads the file;
- the URL expires in one month, or when the original file is removed;
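The expiry rule in the last step could be enforced either by a periodic cleanup job removing stale symlinks, or by a check at download time; a minimal sketch of the check itself (reading "one month" as 30 days is my assumption):

```python
import time

EXPORT_TTL = 30 * 24 * 3600  # "one month", taken here as 30 days

def is_expired(created_at, target_exists, now=None):
    # An export dies when its TTL runs out, or when the original
    # file has been removed -- whichever happens first.
    if now is None:
        now = time.time()
    return (not target_exists) or (now - created_at > EXPORT_TTL)
```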
Architecture
There are two machines:
- a NAS, storing the actual files and the export metadata;
- a web server (as in machine), accessing the file contents and maintaining the metadata;
There are three application servers:
- one running on the NAS to upload the file content (as hinted, SCP, FTP or others); this is already implemented and provided by the NAS;
- "exporter" -- running on the web server, serving the actual exported file contents; this must be an open-source, robust, security conscious HTTP server, serving the exported files directly from the underlaying file system;
- "manager" -- running on the web server, providing file-browsing and creating / exporting the URL; this must be implemented;
How things work
The URL of the exported files has the following syntax http://<host:port>/<token>/<file-name>, where:
- the token is a 128-bit randomly generated sequence, in hexadecimal representation; (it must be generated with a cryptographic random number generator;)
- the file-name is either the original file name (only the last path element) or is entered by the user, and should contain a proper extension; (it should contain only HTTP-compliant characters, without slashes or escaping, or even a further reduced set;)
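The token and file-name rules could be sketched as follows (the exact allowed character set for the file name is my assumption -- the text only requires HTTP-compliant characters without slashes or escaping):

```python
import re
import secrets

def generate_token():
    # 128 bits from the OS CSPRNG, rendered as 32 lowercase hex digits.
    return secrets.token_hex(16)

def sanitize_file_name(name):
    # Keep only the last path element, then restrict it to a conservative
    # URL-safe set: letters, digits, dot, dash and underscore.
    last = name.rsplit("/", 1)[-1]
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", last)
    if safe.strip(".") == "":
        raise ValueError("unusable file name: %r" % name)
    return safe

# e.g. http://files.example.net/<32 hex digits>/report_v2.pdf
url = "http://files.example.net/%s/%s" % (
    generate_token(), sanitize_file_name("/store/alice/report v2.pdf"))
```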
There are three disjoint file-system hierarchies (they can be different mount points, or folders on the same device, but they must not be in a parent-child relationship):
- "store" -- the first stores the underlaying file-system to be exported;
- "links" -- the second stores only folders, named as the tokens above (fanned out based on the prefix), where each folder contains a symbolic link to the actual path on the first file-system;
- "meta" -- the third (and optional) has a similar structure with the second one, but stores meta-data about the actual exported files, for example expiration, etc.
The "exporter" web server allows access only to paths respecting the rule above, disallowing listing at the root level.
The "manager" web server maintains the symlinks and metadata as described above, and based on authentication filters what files can be exported by each user.
Security
Preconditions:
- each user has a dedicated folder inside the "store" -- the first file-system -- where he places the files to be exported;
- each user has his own account, and any access (read or write) to his store is authenticated by the NAS system;
- each of the two web servers (the "exporter" and the "manager") has its own account both on the web server machine and the NAS system;
- each user has a separate account for the "manager", which is managed by this application;
Needed file-system privileges:
- for the "exporter", the static web server:
- read only access to all files exported on the "store" file-system;
- execute only access to all paths from the root to the actual file to be exported on the "store" file-system;
- read only access to all the symbolic links in the "links" file-system;
- execute only access to the first level of folders in the "links" file-system (i.e. only the folder containing all the fanout folders, and then these fanout folders themselves);
- read and execute access on the second level folder (the one containing the symbolic links) on the "links" file-system;
- for the "manager", the frontend web server:
- read access on all the files and (plus execute on) folders in the "store" file-system; (of course it can be denied access to some of them, but those won't be exported;)
- read and write access on both the "links" and "meta" file-systems; (for better security, the first fanout level could be pre-created and only those folders made writable, but this adds minimal security;)
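As a rough illustration of the "exporter" side of these privileges with plain POSIX modes (a real deployment would more likely use per-user ACLs, e.g. setfacl, since the "other" permission bits open access to every local account, not just the "exporter" one):

```python
import os

def open_path_for_exporter(store_root, file_path):
    # Grant execute-only (search) permission on every directory from the
    # file up to the store root, and read-only access on the file itself.
    # Assumes file_path really lives somewhere under store_root.
    os.chmod(file_path, os.stat(file_path).st_mode | 0o044)  # r-- for group/other
    d = os.path.dirname(os.path.abspath(file_path))
    while True:
        os.chmod(d, os.stat(d).st_mode | 0o011)              # --x for group/other
        if os.path.samefile(d, store_root):
            break
        d = os.path.dirname(d)
```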
Authentication:
- the "manager" employs HTTPS for authenticating himself, and providing confidentiality of the operations;
- the "manager" uses HTTP digest authentication to authenticate its users;
- the "manager" requires authentication for each and every request;
Miscellaneous:
- both the "exporter" and the "manager" web servers are security conscious implementations;
- the "manager" logic code does not run inside the web server, but is either CGI-based or reverse-proxied;
Analysis:
- because the token is randomly generated with a proper length, it is infeasible to try to "guess" what files are exported; thus we can assume that all exported files are accessed only by those that know about the token; (thus we have a rudimentary access control;)
- because the "guests" are not authenticated anyone with the token could share it with another person, or make it public, thus compromising the confidentiality; but this could easily be done by republishing the file, thus no mater what options we choose would have no impact;
- we could employ HTTPS for the "exporter" but it would provide only marginal security, as the token could have already been transported over unencrypted channels (like email, chat, etc.);
- if the "manager" logic code is compromised all shared files are accessible to the attacker; but because we separate the authentication (done by the web-server) and the logic (implemented in a different process) this could be done only by someone already having an account on the system; (thus it is less likely for someone from the inside to attack the system;)
- (although unlikely) if someone compromises the "exporter" web server (thus assuming he can run arbitrary code under the privileges of the "exporter" user), the attacker couldn't obtain access to any file without knowing the secret token;
- (although unlikely) if someone compromises the "manager" web server (not the logic code) then all shared files are accessible to the attacker;
- (although very unlikely) if someone compromises any web server, and then he manages to elevate to root access, all bets are off...
Issues
- how to authenticate the two web servers without hard-coding the NAS account passwords either in some scripts or configuration files?
- as each user has his own folder inside the "store" file-system, that folder should also be accessible inside his home folder; how to achieve this? (through symlinks inside the home folder to the "store"?)
Nice to have features
(The title actually translates as "never going to be implemented"...)
- statistics of file access;
- revocation of exported URLs;
- upload support to authenticated users;
- upload support to "guest" users (based on access tokens and quota);