// FIXME: if an exception is thrown, we shouldn't necessarily cache...
// FIXME: there's some annoying duplication of code in the various versioned mains

// add the Range header in there too. should return 206

// FIXME: cgi per-request arena allocator

// i need to add a bunch of type templates for validations... maybe @NotNull or NotNull!

// FIXME: I might make a cgi proxy class which can change things; the underlying one is still immutable
// but the later one can edit and simplify the api. You'd have to use the subclass tho!

/*
void foo(int f, @("test") string s) {}

void main() {
	static if(is(typeof(foo) Params == __parameters))
		//pragma(msg, __traits(getAttributes, Params[0]));
		pragma(msg, __traits(getAttributes, Params[1..2]));
	else
		pragma(msg, "fail");
}
*/

// Note: spawn-fcgi can help with fastcgi on nginx

// FIXME: to do: add openssl optionally
// make sure embedded_httpd doesn't send two answers if one writes() then dies

// future direction: websocket as a separate process that you can sendfile to for an async passoff of those long-lived connections

/*
	Session manager process: it spawns a new process, passing a
	command line argument, to just be a little key/value store
	of some serializable struct. On Windows, it CreateProcess.
	On Linux, it can just fork or maybe fork/exec. The session
	key is in a cookie.

	Server-side event process: spawns an async manager. You can
	push stuff out to channel ids and the clients listen to it.

	websocket process: spawns an async handler. They can talk to
	each other or get info from a cgi request.

	Tempting to put web.d 2.0 in here. It would:
		* map urls and form generation to functions
		* have data presentation magic
		* do the skeleton stuff like 1.0
		* auto-cache generated stuff in files (at least if pure?)
		* introspect functions in json for consumers


	https://linux.die.net/man/3/posix_spawn
*/

/++
	Provides a uniform server-side API for CGI, FastCGI, SCGI, and HTTP web applications.
	Offers both lower- and higher- level api options among other common (optional) things like websocket and event source serving support, session management, and job scheduling.

	---
	import arsd.cgi;

	// Instead of writing your own main(), you should write a function
	// that takes a Cgi param, and use mixin GenericMain
	// for maximum compatibility with different web servers.
	void hello(Cgi cgi) {
		cgi.setResponseContentType("text/plain");

		if("name" in cgi.get)
			cgi.write("Hello, " ~ cgi.get["name"]);
		else
			cgi.write("Hello, world!");
	}

	mixin GenericMain!hello;
	---

	Or:

	---
	import arsd.cgi;

	class MyApi : WebObject {
		@UrlName("")
		string hello(string name = null) {
			if(name is null)
				return "Hello, world!";
			else
				return "Hello, " ~ name;
		}
	}
	mixin DispatcherMain!(
		"/".serveApi!MyApi
	);
	---

	$(NOTE
		Please note that using the higher-level api will add a dependency on arsd.dom and arsd.jsvar to your application.
		If you use `dmd -i` or `ldc2 -i` to build, it will just work, but with dub, you will have to do `dub add arsd-official:jsvar`
		and `dub add arsd-official:dom` yourself.
	)

	Test on console (works in any interface mode):
	$(CONSOLE
		$ ./cgi_hello GET / name=whatever
	)

	If using http version (default on `dub` builds, or on custom builds when passing `-version=embedded_httpd` to dmd):
	$(CONSOLE
		$ ./cgi_hello --port 8080
		# now you can go to http://localhost:8080/?name=whatever
	)

	Please note: the default port for http is 8085 and for scgi is 4000. I recommend you set your own by the command line argument in a startup script instead of relying on any hard coded defaults. It is possible to code your own with [RequestServer], however.


	Build_Configurations:

	cgi.d tries to be flexible to meet your needs. It is possible to configure it either at runtime (by writing your own `main` function and constructing a [RequestServer] object) or at compile time using the `version` switch to the compiler or a dub `subConfiguration`.
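
	As a minimal sketch of the runtime route, the following writes its own `main` and constructs a [RequestServer] by hand. It mirrors the long-form example found later in this module; the handler name `requestHandler` and the port number 9000 are arbitrary choices for illustration, not defaults of the library.

	```d
	import arsd.cgi;

	void requestHandler(Cgi cgi) {
		cgi.setResponseContentType("text/plain");
		cgi.write("Hello from a hand-written main", true);
	}

	void main(string[] args) {
		// allow `./app GET / name=value` style simulated requests from the shell
		if(trySimulatedRequest!(requestHandler, Cgi)(args))
			return;

		RequestServer server;
		// our own default port (arbitrary, chosen for this sketch)
		server.listeningPort = 9000;
		// let command line arguments such as --port override that default
		server.configureFromCommandLine(args);
		// serve according to the compile-time configuration
		server.serve!(requestHandler)();
	}
	```

	Setting the port before calling `configureFromCommandLine` means the value in code acts only as a fallback, so a startup script can still pick the real port.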
	If you are using `dub`, use:

	```sdlang
	subConfiguration "arsd-official:cgi" "VALUE_HERE"
	```

	or, in dub.json:

	```json
		"subConfigurations": {"arsd-official:cgi": "VALUE_HERE"}
	```

	to change versions. The possible options for `VALUE_HERE` are:

	$(LIST
		* `embedded_httpd` for the embedded httpd version (built-in web server). This is the default for dub builds. You can run the program then connect directly to it from your browser. Note: prior to version 11, this would be embedded_httpd_processes on Linux and embedded_httpd_threads everywhere else. It now means embedded_httpd_hybrid everywhere supported and embedded_httpd_threads everywhere else.
		* `cgi` for traditional cgi binaries. These are run by an outside web server as-needed to handle requests.
		* `fastcgi` for FastCGI builds. FastCGI is managed from an outside helper; there's one built into Microsoft IIS, Apache httpd, and Lighttpd, and a generic program you can use with nginx called `spawn-fcgi`. If you don't already know how to use it, I suggest you use one of the other modes.
		* `scgi` for SCGI builds. SCGI is a simplified form of FastCGI, where you run the server as an application service which is proxied by your outside webserver.
		* `stdio_http` for speaking raw http over stdin and stdout. This is made for systemd services. See [RequestServer.serveSingleHttpConnectionOnStdio] for more information.
	)

	With dmd, use:

	$(TABLE_ROWS

		* + Interfaces
		  + (mutually exclusive)

		* - `-version=plain_cgi`
			- The default when building the module alone without dub - a traditional, plain CGI executable will be generated.
		* - `-version=embedded_httpd`
			- An HTTP server will be embedded in the generated executable. This is the default when building with dub.
		* - `-version=fastcgi`
			- A FastCGI executable will be generated.
		* - `-version=scgi`
			- A SCGI (SimpleCGI) executable will be generated.
		* - `-version=embedded_httpd_hybrid`
			- An HTTP server that uses a combination of processes, threads, and fibers to better handle large numbers of idle connections. Recommended if you are going to serve websockets in a non-local application.
		* - `-version=embedded_httpd_threads`
			- The embedded HTTP server will use a single process with a thread pool. (use instead of plain `embedded_httpd` if you want this specific implementation)
		* - `-version=embedded_httpd_processes`
			- The embedded HTTP server will use a prefork style process pool. (use instead of plain `embedded_httpd` if you want this specific implementation)
		* - `-version=embedded_httpd_processes_accept_after_fork`
			- It will call accept() in each child process, after forking. This is currently the only option, though I am experimenting with other ideas. You probably should NOT specify this right now.
		* - `-version=stdio_http`
			- The embedded HTTP server will be spoken over stdin and stdout.

		* + Tweaks
		  + (can be used together with others)

		* - `-version=cgi_with_websocket`
			- The CGI class has websocket server support. (This is on by default now.)
		* - `-version=with_openssl`
			- not currently used
		* - `-version=cgi_embedded_sessions`
			- The session server will be embedded in the cgi.d server process
		* - `-version=cgi_session_server_process`
			- The session will be provided in a separate process, provided by cgi.d.
	)

	For example:

	For CGI, `dmd yourfile.d cgi.d` then put the executable in your cgi-bin directory.

	For FastCGI: `dmd yourfile.d cgi.d -version=fastcgi` and run it. spawn-fcgi helps on nginx. You can put the file in the directory for Apache. On IIS, run it with a port on the command line (this causes it to call FCGX_OpenSocket, which can work on nginx too).

	For SCGI: `dmd yourfile.d cgi.d -version=scgi` and run the executable, providing a port number on the command line.

	For an embedded HTTP server, run `dmd yourfile.d cgi.d -version=embedded_httpd` and run the generated program.
	It listens on port 8085 by default. You can change this on the command line with the --port option when running your program.

	Simulating_requests:

	If you are using one of the [GenericMain] or [DispatcherMain] mixins, or main with your own call to [RequestServer.trySimulatedRequest], you can simulate requests from your command-line shell. Call the program like this:

	$(CONSOLE
	./yourprogram GET / name=adr
	)

	And it will print the result to stdout instead of running a server, regardless of build mode.

	CGI_Setup_tips:

	On Apache, you may do `SetHandler cgi-script` in your `.htaccess` file to set a particular file to be run through the cgi program. Note that all "subdirectories" of it also run the program; if you configure `/foo` to be a cgi script, then going to `/foo/bar` will call your cgi handler function with `cgi.pathInfo == "/bar"`.

	Overview_Of_Basic_Concepts:

	cgi.d offers both lower-level handler apis as well as higher-level auto-dispatcher apis. For a lower-level handler function, you'll probably want to review the following functions:

		Input: [Cgi.get], [Cgi.post], [Cgi.request], [Cgi.files], [Cgi.cookies], [Cgi.pathInfo], [Cgi.requestMethod],
		       and HTTP headers ([Cgi.headers], [Cgi.userAgent], [Cgi.referrer], [Cgi.accept], [Cgi.authorization], [Cgi.lastEventId])

		Output: [Cgi.write], [Cgi.header], [Cgi.setResponseStatus], [Cgi.setResponseContentType], [Cgi.gzipResponse]

		Cookies: [Cgi.setCookie], [Cgi.clearCookie], [Cgi.cookie], [Cgi.cookies]

		Caching: [Cgi.setResponseExpires], [Cgi.updateResponseExpires], [Cgi.setCache]

		Redirections: [Cgi.setResponseLocation]

		Other Information: [Cgi.remoteAddress], [Cgi.https], [Cgi.port], [Cgi.scriptName], [Cgi.requestUri], [Cgi.getCurrentCompleteUri], [Cgi.onRequestBodyDataReceived]

		Websockets: [Websocket], [websocketRequested], [acceptWebsocket]. For websockets, use the `embedded_httpd_hybrid` build mode for best results, because it is optimized for handling large numbers of idle connections compared to the other build modes.
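
	As a rough sketch of the websocket entry points named above, a handler might accept the connection and echo text frames back. Only `websocketRequested` and `acceptWebsocket` are named in this overview; the `send`, `recv`, `close`, `opcode`, `textData`, and `WebSocketOpcode` names are assumptions recalled from the library's websocket documentation, so check them against your copy of cgi.d before relying on them. The handler name `echoHandler` is made up for the example.

	```d
	import arsd.cgi;

	void echoHandler(Cgi cgi) {
		if(cgi.websocketRequested) {
			// upgrade this request to a websocket connection
			auto websocket = cgi.acceptWebsocket();

			// assumed api: recv waits for the next frame from the client
			auto msg = websocket.recv();
			while(msg.opcode != WebSocketOpcode.close) {
				// echo text frames back, ignore everything else
				if(msg.opcode == WebSocketOpcode.text)
					websocket.send(msg.textData);
				msg = websocket.recv();
			}

			websocket.close();
		} else {
			// a plain http request reached the websocket endpoint
			cgi.setResponseContentType("text/plain");
			cgi.write("This endpoint expects a websocket client.", true);
		}
	}

	mixin GenericMain!echoHandler;
	```

	Since each connection stays open until the client closes it, this is the kind of handler where the `embedded_httpd_hybrid` recommendation matters.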
		Overriding behavior for special cases, such as streaming input data: see the virtual functions [Cgi.handleIncomingDataChunk], [Cgi.prepareForIncomingDataChunks], [Cgi.cleanUpPostDataState]

	A basic program using the lower-level api might look like:

	---
	import arsd.cgi;

	// you write a request handler which always takes a Cgi object
	void handler(Cgi cgi) {
		/+
			when the user goes to your site, suppose you are being hosted at http://example.com/yourapp

			If the user goes to http://example.com/yourapp/test?name=value
			then the url will be parsed out into the following pieces:

				cgi.pathInfo == "/test". This is everything after yourapp's name. (If you are doing an embedded http server, your app's name is blank, so pathInfo will be the whole path of the url.)

				cgi.scriptName == "yourapp". With an embedded http server, this will be blank.

				cgi.host == "example.com"

				cgi.https == false

				cgi.queryString == "name=value" (there's also cgi.search, which will be "?name=value", including the ?)

				The query string is further parsed into the `get` and `getArray` members, so:

				cgi.get == ["name": "value"], meaning you can do `cgi.get["name"] == "value"`

				And

				cgi.getArray == ["name": ["value"]].

				Why is there both `get` and `getArray`? The standard allows names to be repeated. This can be very useful,
				it is how http forms naturally pass multiple items like a set of checkboxes. So `getArray` is the complete data
				if you need it. But since so often you only care about one value, the `get` member provides more convenient access.

			We can use these members to process the request and build link urls. Other info from the request is in other members, we'll look at them later.
		+/
		switch(cgi.pathInfo) {
			// the home page will be a small html form that can set a cookie.
			case "/":
				cgi.write(`<!DOCTYPE html>
				<html>
				<body>
					<form method="POST" action="set-cookie">
						<label>Your name: <input type="text" name="name" /></label>
						<input type="submit" value="Submit" />
					</form>
				</body>
				</html>
				`, true); // the , true tells it that this is the one, complete response i want to send, allowing some optimizations.
			break;
			// POSTing to this will set a cookie with our submitted name
			case "/set-cookie":
				// HTTP has a number of request methods (also called "verbs") to tell
				// what you should do with the given resource.
				// The most common are GET and POST, the ones used in html forms.
				// You can check which one was used with the `cgi.requestMethod` property.
				if(cgi.requestMethod == Cgi.RequestMethod.POST) {

					// headers like redirections need to be set before we call `write`
					cgi.setResponseLocation("read-cookie");

					// just like how url params go into cgi.get/getArray, form data submitted in a POST
					// body go to cgi.post/postArray. Please note that a POST request can also have get
					// params in addition to post params.
					//
					// There's also a convenience function `cgi.request("name")` which checks post first,
					// then get if it isn't found there, and then returns a default value if it is in neither.
					if("name" in cgi.post) {
						// we can set cookies with a method too
						// again, cookies need to be set before calling `cgi.write`, since they
						// are a kind of header.
						cgi.setCookie("name" , cgi.post["name"]);
					}

					// the user will probably never see this, since the response location
					// is an automatic redirect, but it is still best to say something anyway
					cgi.write("Redirecting you to see the cookie...", true);
				} else {
					// you can write out response codes and headers
					// as well as response bodies
					//
					// But always check the cgi docs before using the generic
					// `header` method - if there is a specific method for your
					// header, use it before resorting to the generic one to avoid
					// a header value being sent twice.
					cgi.setResponseStatus("405 Method Not Allowed");
					// there is no special method for the Allow header, so you can use the generic header function
					cgi.header("Allow: POST");
					// but content type does have a method, so prefer to use it:
					cgi.setResponseContentType("text/plain");

					// all the headers are buffered, and will be sent upon the first body
					// write. you can actually modify some of them before sending if need be.
					cgi.write("You must use the POST http verb on this resource.", true);
				}
			break;
			// and GETting this will read the cookie back out
			case "/read-cookie":
				// I did NOT pass `,true` here because this is writing a partial response.
				// It is possible to stream data to the user in chunks by writing partial
				// responses then calling `cgi.flush();` to send the partial response immediately.
				// normally, you'd only send partial chunks if you have to - it is better to build
				// a response as a whole and send it as a whole whenever possible - but here I want
				// to demo that you can.
				cgi.write("Hello, ");
				if("name" in cgi.cookies) {
					import arsd.dom; // dom.d provides a lot of helpers for html
					// since the cookie is set, we need to write it out properly to
					// avoid cross-site scripting attacks.
					//
					// Getting this stuff right automatically is a benefit of using the higher
					// level apis, but this demo is to show the fundamental building blocks, so
					// we're responsible to take care of it.
					cgi.write(htmlEntitiesEncode(cgi.cookies["name"]));
				} else {
					cgi.write("friend");
				}

				// note that I never called cgi.setResponseContentType, since the default is text/html.
				// it doesn't hurt to do it explicitly though, just remember to do it before any cgi.write
				// calls.
			break;
			default:
				// no path matched
				cgi.setResponseStatus("404 Not Found");
				cgi.write("Resource not found.", true);
		}
	}

	// and this adds the boilerplate to set up a server according to the
	// compile version configuration and call your handler as requests come in
	mixin GenericMain!handler; // the `handler` here is the name of your function
	---

	Even if you plan to always use the higher-level apis, I still recommend you at least familiarize yourself with the lower level functions, since they provide the lightest weight, most flexible options to get down to business if you ever need them.

	In the lower-level api, the [Cgi] object represents your HTTP transaction. It has functions to describe the request and for you to send your response. It leaves the details of how you do it up to you. The general guideline though is to avoid depending on any variables outside your handler function, since there's no guarantee they will survive to another handler. You can use global vars as a lazy initialized cache, but you should always be ready in case it is empty. (One exception: if you use `-version=embedded_httpd_threads -version=cgi_no_fork`, then you can rely on it more, but you should still really write things assuming your function won't have anything survive beyond its return for max scalability and compatibility.)

	A basic program using the higher-level apis might look like:

	---
	/+
	import arsd.cgi;

	struct LoginData {
		string currentUser;
	}

	class AppClass : WebObject {
		string foo() {}
	}

	mixin DispatcherMain!(
		"/assets/".serveStaticFileDirectory("assets/", true), // serve the files in the assets subdirectory
		"/".serveApi!AppClass,
		"/thing/".serveRestObject,
	);
	+/
	---

	Guide_for_PHP_users:
	(Please note: I wrote this section in 2008. A lot of PHP hosts still ran 4.x back then, so it was common to avoid using classes - introduced in php 5 - to maintain compatibility! If you're coming from php more recently, this may not be relevant anymore, but still might help you.)

	If you are coming from old-style PHP, here's a quick guide to help you get started:

	$(SIDE_BY_SIDE
		$(COLUMN
			```php
			<?php
				$foo = $_POST["foo"];
				$bar = $_GET["bar"];
				$baz = $_COOKIE["baz"];

				$user_ip = $_SERVER["REMOTE_ADDR"];
				$host = $_SERVER["HTTP_HOST"];
				$path = $_SERVER["PATH_INFO"];

				setcookie("baz", "some value");

				echo "hello!";
			?>
			```
		)
		$(COLUMN
			---
			import arsd.cgi;
			void app(Cgi cgi) {
				string foo = cgi.post["foo"];
				string bar = cgi.get["bar"];
				string baz = cgi.cookies["baz"];

				string user_ip = cgi.remoteAddress;
				string host = cgi.host;
				string path = cgi.pathInfo;

				cgi.setCookie("baz", "some value");

				cgi.write("hello!");
			}

			mixin GenericMain!app;
			---
		)
	)

	$(H3 Array elements)

	In PHP, you can give a form element a name like `"something[]"`, and then `$_POST["something"]` gives an array. In D, you can use whatever name you want, and access an array of values with the `cgi.getArray["name"]` and `cgi.postArray["name"]` members.

	$(H3 Databases)

	PHP has a lot of stuff in its standard library. cgi.d doesn't include most of these, but the rest of my arsd repository has much of it. For example, to access a MySQL database, download `database.d` and `mysql.d` from my github repo, and try this code (assuming, of course, your database is set up):

	---
	import arsd.cgi;
	import arsd.mysql;

	void app(Cgi cgi) {
		auto database = new MySql("localhost", "username", "password", "database_name");
		// run the query on the connection object created above
		foreach(row; database.query("SELECT count(id) FROM people"))
			cgi.write(row[0] ~ " people in database");
	}

	mixin GenericMain!app;
	---

	Similar modules are available for PostgreSQL, Microsoft SQL Server, and SQLite databases, implementing the same basic interface.

	See_Also:

	You may also want to see [arsd.dom], [arsd.webtemplate], and maybe some functions from my old [arsd.html] for more code for making web applications. dom and webtemplate are used by the higher-level api here in cgi.d.

	For working with json, try [arsd.jsvar].

	[arsd.database], [arsd.mysql], [arsd.postgres], [arsd.mssql], and [arsd.sqlite] can help in accessing databases.

	If you are looking to access a web application via HTTP, try [arsd.http2].

	Copyright:

	cgi.d copyright 2008-2023, Adam D. Ruppe. Provided under the Boost Software License.
Yes, this file is old, and yes, it is still actively maintained and used. History: An import of `arsd.core` was added on March 21, 2023 (dub v11.0). Prior to this, the module's default configuration was completely stand-alone. You must now include the `core.d` file in your builds with `cgi.d`. This change is primarily to integrate the event loops across the library, allowing you to more easily use cgi.d along with my other libraries like simpledisplay and http2.d. Previously, you'd have to run separate helper threads. Now, they can all automatically work together. +/ module arsd.cgi; static import arsd.core; version(Posix) import arsd.core : makeNonBlocking; // FIXME: Nullable!T can be a checkbox that enables/disables the T on the automatic form // and a SumType!(T, R) can be a radio box to pick between T and R to disclose the extra boxes on the automatic form /++ This micro-example uses the [dispatcher] api to act as a simple http file server, serving files found in the current directory and its children. +/ version(Demo) unittest { import arsd.cgi; mixin DispatcherMain!( "/".serveStaticFileDirectory(null, true) ); } /++ Same as the previous example, but written out long-form without the use of [DispatcherMain] nor [GenericMain]. +/ version(Demo) unittest { import arsd.cgi; void requestHandler(Cgi cgi) { cgi.dispatcher!( "/".serveStaticFileDirectory(null, true) ); } // mixin GenericMain!requestHandler would add this function: void main(string[] args) { // this is all the content of [cgiMainImpl] which you can also call // cgi.d embeds a few add on functions like real time event forwarders // and session servers it can run in other processes. this spawns them, if needed. if(tryAddonServers(args)) return; // cgi.d allows you to easily simulate http requests from the command line, // without actually starting a server. this function will do that. 
if(trySimulatedRequest!(requestHandler, Cgi)(args)) return; RequestServer server; // you can change the default port here if you like // server.listeningPort = 9000; // then call this to let the command line args override your default server.configureFromCommandLine(args); // here is where you could print out the listeningPort to the user if you wanted // and serve the request(s) according to the compile configuration server.serve!(requestHandler)(); // or you could explicitly choose a serve mode like this: // server.serveEmbeddedHttp!requestHandler(); } } /++ cgi.d has built-in testing helpers too. These will provide mock requests and mock sessions that otherwise run through the rest of the internal mechanisms to call your functions without actually spinning up a server. +/ version(Demo) unittest { import arsd.cgi; void requestHandler(Cgi cgi) { } // D doesn't let me embed a unittest inside an example unittest // so this is a function, but you can do it however in your real program /* unittest */ void runTests() { auto tester = new CgiTester(&requestHandler); auto response = tester.GET("/"); assert(response.code == 200); } } static import std.file; // for a single thread, linear request thing, use: // -version=embedded_httpd_threads -version=cgi_no_threads version(Posix) { version(CRuntime_Musl) { } else version(minimal) { } else { version(FreeBSD) { // I never implemented the fancy stuff there either } else { version=with_breaking_cgi_features; version=with_sendfd; version=with_addon_servers; } } } version(Windows) { version(minimal) { } else { // not too concerned about gdc here since the mingw version is fairly new as well version=with_breaking_cgi_features; } } // FIXME: can use the arsd.core function now but it is trivial anyway tbh void cloexec(int fd) { version(Posix) { import core.sys.posix.fcntl; fcntl(fd, F_SETFD, FD_CLOEXEC); } } void cloexec(Socket s) { version(Posix) { import core.sys.posix.fcntl; fcntl(s.handle, F_SETFD, FD_CLOEXEC); } } // the 
servers must know about the connections to talk to them; the interfaces are vital version(with_addon_servers) version=with_addon_servers_connections; version(embedded_httpd) { version=embedded_httpd_hybrid; /* version(with_openssl) { pragma(lib, "crypto"); pragma(lib, "ssl"); } */ } version(embedded_httpd_hybrid) { version=embedded_httpd_threads; version(cgi_no_fork) {} else version(Posix) version=cgi_use_fork; version=cgi_use_fiber; } version(cgi_use_fork) enum cgi_use_fork_default = true; else enum cgi_use_fork_default = false; version(embedded_httpd_processes) version=embedded_httpd_processes_accept_after_fork; // I am getting much better average performance on this, so just keeping it. But the other way MIGHT help keep the variation down so i wanna keep the code to play with later version(embedded_httpd_threads) { // unless the user overrides the default.. version(cgi_session_server_process) {} else version=cgi_embedded_sessions; } version(scgi) { // unless the user overrides the default.. version(cgi_session_server_process) {} else version=cgi_embedded_sessions; } // fall back if the other is not defined so we can cleanly version it below version(cgi_embedded_sessions) {} else version=cgi_session_server_process; version=cgi_with_websocket; enum long defaultMaxContentLength = 5_000_000; /* To do a file download offer in the browser: cgi.setResponseContentType("text/csv"); cgi.header("Content-Disposition: attachment; filename=\"customers.csv\""); */ // FIXME: the location header is supposed to be an absolute url I guess. // FIXME: would be cool to flush part of a dom document before complete // somehow in here and dom.d. // these are public so you can mixin GenericMain. // FIXME: use a function level import instead! 
public import std.string; public import std.stdio; public import std.conv; import std.uri; import std.uni; import std.algorithm.comparison; import std.algorithm.searching; import std.exception; import std.base64; static import std.algorithm; import std.datetime; import std.range; import std.process; import std.zlib; T[] consume(T)(T[] range, int count) { if(count > range.length) count = range.length; return range[count..$]; } int locationOf(T)(T[] data, string item) { const(ubyte[]) d = cast(const(ubyte[])) data; const(ubyte[]) i = cast(const(ubyte[])) item; // this is a vague sanity check to ensure we aren't getting insanely // sized input that will infinite loop below. it should never happen; // even huge file uploads ought to come in smaller individual pieces. if(d.length > (int.max/2)) throw new Exception("excessive block of input"); for(int a = 0; a < d.length; a++) { if(a + i.length > d.length) return -1; if(d[a..a+i.length] == i) return a; } return -1; } /// If you are doing a custom cgi class, mixing this in can take care of /// the required constructors for you mixin template ForwardCgiConstructors() { this(long maxContentLength = defaultMaxContentLength, string[string] env = null, const(ubyte)[] delegate() readdata = null, void delegate(const(ubyte)[]) _rawDataOutput = null, void delegate() _flush = null ) { super(maxContentLength, env, readdata, _rawDataOutput, _flush); } this(string[] args) { super(args); } this( BufferedInputRange inputData, string address, ushort _port, int pathInfoStarts = 0, bool _https = false, void delegate(const(ubyte)[]) _rawDataOutput = null, void delegate() _flush = null, // this pointer tells if the connection is supposed to be closed after we handle this bool* closeConnection = null) { super(inputData, address, _port, pathInfoStarts, _https, _rawDataOutput, _flush, closeConnection); } this(BufferedInputRange ir, bool* closeConnection) { super(ir, closeConnection); } } /// thrown when a connection is closed remotely while we 
waiting on data from it class ConnectionClosedException : Exception { this(string message, string file = __FILE__, size_t line = __LINE__, Throwable next = null) { super(message, file, line, next); } } version(Windows) { // FIXME: ugly hack to solve stdin exception problems on Windows: // reading stdin results in StdioException (Bad file descriptor) // this is probably due to http://d.puremagic.com/issues/show_bug.cgi?id=3425 private struct stdin { struct ByChunk { // Replicates std.stdio.ByChunk private: ubyte[] chunk_; public: this(size_t size) in { assert(size, "size must be larger than 0"); } do { chunk_ = new ubyte[](size); popFront(); } @property bool empty() const { return !std.stdio.stdin.isOpen || std.stdio.stdin.eof; // Ugly, but seems to do the job } @property nothrow ubyte[] front() { return chunk_; } void popFront() { enforce(!empty, "Cannot call popFront on empty range"); chunk_ = stdin.rawRead(chunk_); } } import core.sys.windows.windows; static: T[] rawRead(T)(T[] buf) { uint bytesRead; auto result = ReadFile(GetStdHandle(STD_INPUT_HANDLE), buf.ptr, cast(int) (buf.length * T.sizeof), &bytesRead, null); if (!result) { auto err = GetLastError(); if (err == 38/*ERROR_HANDLE_EOF*/ || err == 109/*ERROR_BROKEN_PIPE*/) // 'good' errors meaning end of input return buf[0..0]; // Some other error, throw it char* buffer; scope(exit) LocalFree(buffer); // FORMAT_MESSAGE_ALLOCATE_BUFFER = 0x00000100 // FORMAT_MESSAGE_FROM_SYSTEM = 0x00001000 FormatMessageA(0x1100, null, err, 0, cast(char*)&buffer, 256, null); throw new Exception(to!string(buffer)); } enforce(!(bytesRead % T.sizeof), "I/O error"); return buf[0..bytesRead / T.sizeof]; } auto byChunk(size_t sz) { return ByChunk(sz); } void close() { std.stdio.stdin.close; } } } /// The main interface with the web request class Cgi { public: /// the methods a request can be enum RequestMethod { GET, HEAD, POST, PUT, DELETE, // GET and POST are the ones that really work // these are defined in the standard, but idk 
if they are useful for anything OPTIONS, TRACE, CONNECT, // These seem new, I have only recently seen them PATCH, MERGE, // this is an extension for when the method is not specified and you want to assume CommandLine } /+ /++ Cgi provides a per-request memory pool +/ void[] allocateMemory(size_t nBytes) { } /// ditto void[] reallocateMemory(void[] old, size_t nBytes) { } /// ditto void freeMemory(void[] memory) { } +/ /* import core.runtime; auto args = Runtime.args(); we can call the app a few ways: 1) set up the environment variables and call the app (manually simulating CGI) 2) simulate a call automatically: ./app method 'uri' for example: ./app get /path?arg arg2=something Anything on the uri is treated as query string etc on get method, further args are appended to the query string (encoded automatically) on post method, further args are done as post @name means import from file "name". if name == -, it uses stdin (so info=@- means set info to the value of stdin) Other arguments include: --cookie name=value (these are all concated together) --header 'X-Something: cool' --referrer 'something' --port 80 --remote-address some.ip.address.here --https yes --user-agent 'something' --userpass 'user:pass' --authorization 'Basic base64encoded_user:pass' --accept 'content' // FIXME: better example --last-event-id 'something' --host 'something.com' Non-simulation arguments: --port xxx listening port for non-cgi things (valid for the cgi interfaces) --listening-host the ip address the application should listen on, or if you want to use unix domain sockets, it is here you can set them: `--listening-host unix:filename` or, on Linux, `--listening-host abstract:name`. 
*/ /** Initializes it with command line arguments (for easy testing) */ this(string[] args, void delegate(const(ubyte)[]) _rawDataOutput = null) { rawDataOutput = _rawDataOutput; // these are all set locally so the loop works // without triggering errors in dmd 2.064 // we go ahead and set them at the end of it to the this version int port; string referrer; string remoteAddress; string userAgent; string authorization; string origin; string accept; string lastEventId; bool https; string host; RequestMethod requestMethod; string requestUri; string pathInfo; string queryString; bool lookingForMethod; bool lookingForUri; string nextArgIs; string _cookie; string _queryString; string[][string] _post; string[string] _headers; string[] breakUp(string s) { string k, v; auto idx = s.indexOf("="); if(idx == -1) { k = s; } else { k = s[0 .. idx]; v = s[idx + 1 .. $]; } return [k, v]; } lookingForMethod = true; scriptName = args[0]; scriptFileName = args[0]; environmentVariables = cast(const) environment.toAA; foreach(arg; args[1 .. $]) { if(arg.startsWith("--")) { nextArgIs = arg[2 .. 
$]; } else if(nextArgIs.length) { if (nextArgIs == "cookie") { auto info = breakUp(arg); if(_cookie.length) _cookie ~= "; "; _cookie ~= std.uri.encodeComponent(info[0]) ~ "=" ~ std.uri.encodeComponent(info[1]); } else if (nextArgIs == "port") { port = to!int(arg); } else if (nextArgIs == "referrer") { referrer = arg; } else if (nextArgIs == "remote-address") { remoteAddress = arg; } else if (nextArgIs == "user-agent") { userAgent = arg; } else if (nextArgIs == "authorization") { authorization = arg; } else if (nextArgIs == "userpass") { authorization = "Basic " ~ Base64.encode(cast(immutable(ubyte)[]) (arg)).idup; } else if (nextArgIs == "origin") { origin = arg; } else if (nextArgIs == "accept") { accept = arg; } else if (nextArgIs == "last-event-id") { lastEventId = arg; } else if (nextArgIs == "https") { if(arg == "yes") https = true; } else if (nextArgIs == "header") { string thing, other; auto idx = arg.indexOf(":"); if(idx == -1) throw new Exception("need a colon in a http header"); thing = arg[0 .. idx]; other = arg[idx + 1.. $]; _headers[thing.strip.toLower()] = other.strip; } else if (nextArgIs == "host") { host = arg; } // else // skip, we don't know it but that's ok, it might be used elsewhere so no error nextArgIs = null; } else if(lookingForMethod) { lookingForMethod = false; lookingForUri = true; if(arg.asLowerCase().equal("commandline")) requestMethod = RequestMethod.CommandLine; else requestMethod = to!RequestMethod(arg.toUpper()); } else if(lookingForUri) { lookingForUri = false; requestUri = arg; auto idx = arg.indexOf("?"); if(idx == -1) pathInfo = arg; else { pathInfo = arg[0 .. idx]; _queryString = arg[idx + 1 .. 
$]; } } else { // it is an argument of some sort if(requestMethod == Cgi.RequestMethod.POST || requestMethod == Cgi.RequestMethod.PATCH || requestMethod == Cgi.RequestMethod.PUT || requestMethod == Cgi.RequestMethod.CommandLine) { auto parts = breakUp(arg); _post[parts[0]] ~= parts[1]; allPostNamesInOrder ~= parts[0]; allPostValuesInOrder ~= parts[1]; } else { if(_queryString.length) _queryString ~= "&"; auto parts = breakUp(arg); _queryString ~= std.uri.encodeComponent(parts[0]) ~ "=" ~ std.uri.encodeComponent(parts[1]); } } } acceptsGzip = false; keepAliveRequested = false; requestHeaders = cast(immutable) _headers; cookie = _cookie; cookiesArray = getCookieArray(); cookies = keepLastOf(cookiesArray); queryString = _queryString; getArray = cast(immutable) decodeVariables(queryString, "&", &allGetNamesInOrder, &allGetValuesInOrder); get = keepLastOf(getArray); postArray = cast(immutable) _post; post = keepLastOf(_post); // FIXME filesArray = null; files = null; isCalledWithCommandLineArguments = true; this.port = port; this.referrer = referrer; this.remoteAddress = remoteAddress; this.userAgent = userAgent; this.authorization = authorization; this.origin = origin; this.accept = accept; this.lastEventId = lastEventId; this.https = https; this.host = host; this.requestMethod = requestMethod; this.requestUri = requestUri; this.pathInfo = pathInfo; this.queryString = queryString; this.postBody = null; } private { string[] allPostNamesInOrder; string[] allPostValuesInOrder; string[] allGetNamesInOrder; string[] allGetValuesInOrder; } CgiConnectionHandle getOutputFileHandle() { return _outputFileHandle; } CgiConnectionHandle _outputFileHandle = INVALID_CGI_CONNECTION_HANDLE; /** Initializes it using a CGI or CGI-like interface */ this(long maxContentLength = defaultMaxContentLength, // use this to override the environment variable listing in string[string] env = null, // and this should return a chunk of data. 
return empty when done const(ubyte)[] delegate() readdata = null, // finally, use this to do custom output if needed void delegate(const(ubyte)[]) _rawDataOutput = null, // to flush the custom output void delegate() _flush = null ) { // these are all set locally so the loop works // without triggering errors in dmd 2.064 // we go ahead and set them at the end of it to the this version int port; string referrer; string remoteAddress; string userAgent; string authorization; string origin; string accept; string lastEventId; bool https; string host; RequestMethod requestMethod; string requestUri; string pathInfo; string queryString; isCalledWithCommandLineArguments = false; rawDataOutput = _rawDataOutput; flushDelegate = _flush; auto getenv = delegate string(string var) { if(env is null) return std.process.environment.get(var); auto e = var in env; if(e is null) return null; return *e; }; environmentVariables = env is null ? cast(const) environment.toAA : env; // fetching all the request headers string[string] requestHeadersHere; foreach(k, v; env is null ? cast(const) environment.toAA() : env) { if(k.startsWith("HTTP_")) { requestHeadersHere[replace(k["HTTP_".length .. $].toLower(), "_", "-")] = v; } } this.requestHeaders = assumeUnique(requestHeadersHere); requestUri = getenv("REQUEST_URI"); cookie = getenv("HTTP_COOKIE"); cookiesArray = getCookieArray(); cookies = keepLastOf(cookiesArray); referrer = getenv("HTTP_REFERER"); userAgent = getenv("HTTP_USER_AGENT"); remoteAddress = getenv("REMOTE_ADDR"); host = getenv("HTTP_HOST"); pathInfo = getenv("PATH_INFO"); queryString = getenv("QUERY_STRING"); scriptName = getenv("SCRIPT_NAME"); { import core.runtime; auto sfn = getenv("SCRIPT_FILENAME"); scriptFileName = sfn.length ? sfn : (Runtime.args.length ? Runtime.args[0] : null); } bool iis = false; // Because IIS doesn't pass requestUri, we simulate it here if it's empty. 
if(requestUri.length == 0) { // IIS sometimes includes the script name as part of the path info - we don't want that if(pathInfo.length >= scriptName.length && (pathInfo[0 .. scriptName.length] == scriptName)) pathInfo = pathInfo[scriptName.length .. $]; requestUri = scriptName ~ pathInfo ~ (queryString.length ? ("?" ~ queryString) : ""); iis = true; // FIXME HACK - used in byChunk below - see bugzilla 6339 // FIXME: this works for apache and iis... but what about others? } auto ugh = decodeVariables(queryString, "&", &allGetNamesInOrder, &allGetValuesInOrder); getArray = assumeUnique(ugh); get = keepLastOf(getArray); // NOTE: on shitpache, you need to specifically forward this authorization = getenv("HTTP_AUTHORIZATION"); // this is a hack because Apache is a shitload of fuck and // refuses to send the real header to us. Compatible // programs should send both the standard and X- versions // NOTE: if you have access to .htaccess or httpd.conf, you can make this // unnecessary with mod_rewrite, so it is commented //if(authorization.length == 0) // if the std is there, use it // authorization = getenv("HTTP_X_AUTHORIZATION"); // the REDIRECT_HTTPS check is here because with an Apache hack, the port can become wrong if(getenv("SERVER_PORT").length && getenv("REDIRECT_HTTPS") != "on") port = to!int(getenv("SERVER_PORT")); else port = 0; // this was probably called from the command line auto ae = getenv("HTTP_ACCEPT_ENCODING"); if(ae.length && ae.indexOf("gzip") != -1) acceptsGzip = true; accept = getenv("HTTP_ACCEPT"); lastEventId = getenv("HTTP_LAST_EVENT_ID"); auto ka = getenv("HTTP_CONNECTION"); if(ka.length && ka.asLowerCase().canFind("keep-alive")) keepAliveRequested = true; auto or = getenv("HTTP_ORIGIN"); origin = or; auto rm = getenv("REQUEST_METHOD"); if(rm.length) requestMethod = to!RequestMethod(getenv("REQUEST_METHOD")); else requestMethod = RequestMethod.CommandLine; // FIXME: hack on REDIRECT_HTTPS; this is there because the work app uses mod_rewrite 
which loses the https flag! So I set it with [E=HTTPS=%HTTPS] or whatever but then it gets translated to here so i want it to still work. This is arguably wrong but meh. https = (getenv("HTTPS") == "on" || getenv("REDIRECT_HTTPS") == "on"); // FIXME: DOCUMENT_ROOT? // FIXME: what about PUT? if(requestMethod == RequestMethod.POST || requestMethod == Cgi.RequestMethod.PATCH || requestMethod == Cgi.RequestMethod.PUT || requestMethod == Cgi.RequestMethod.CommandLine) { version(preserveData) // a hack to make forwarding simpler immutable(ubyte)[] data; size_t amountReceived = 0; auto contentType = getenv("CONTENT_TYPE"); // FIXME: is this ever not going to be set? I guess it depends // on if the server de-chunks and buffers... seems like it has potential // to be slow if they did that. The spec says it is always there though. // And it has worked reliably for me all year in the live environment, // but some servers might be different. auto cls = getenv("CONTENT_LENGTH"); auto contentLength = to!size_t(cls.length ? cls : "0"); immutable originalContentLength = contentLength; if(contentLength) { if(maxContentLength > 0 && contentLength > maxContentLength) { setResponseStatus("413 Request entity too large"); write("You tried to upload a file that is too large."); close(); throw new Exception("POST too large"); } prepareForIncomingDataChunks(contentType, contentLength); int processChunk(in ubyte[] chunk) { if(chunk.length > contentLength) { handleIncomingDataChunk(chunk[0..contentLength]); amountReceived += contentLength; contentLength = 0; return 1; } else { handleIncomingDataChunk(chunk); contentLength -= chunk.length; amountReceived += chunk.length; } if(contentLength == 0) return 1; onRequestBodyDataReceived(amountReceived, originalContentLength); return 0; } if(readdata is null) { foreach(ubyte[] chunk; stdin.byChunk(iis ? contentLength : 4096)) if(processChunk(chunk)) break; } else { // we have a custom data source.. 
auto chunk = readdata(); while(chunk.length) { if(processChunk(chunk)) break; chunk = readdata(); } } onRequestBodyDataReceived(amountReceived, originalContentLength); postArray = assumeUnique(pps._post); filesArray = assumeUnique(pps._files); files = keepLastOf(filesArray); post = keepLastOf(postArray); this.postBody = pps.postBody; cleanUpPostDataState(); } version(preserveData) originalPostData = data; } // fixme: remote_user script name this.port = port; this.referrer = referrer; this.remoteAddress = remoteAddress; this.userAgent = userAgent; this.authorization = authorization; this.origin = origin; this.accept = accept; this.lastEventId = lastEventId; this.https = https; this.host = host; this.requestMethod = requestMethod; this.requestUri = requestUri; this.pathInfo = pathInfo; this.queryString = queryString; } /// Cleans up any temporary files. Do not use the object /// after calling this. /// /// NOTE: it is called automatically by GenericMain // FIXME: this should be called if the constructor fails too, if it has created some garbage... void dispose() { foreach(file; files) { if(!file.contentInMemory) if(std.file.exists(file.contentFilename)) std.file.remove(file.contentFilename); } } private { struct PostParserState { string contentType; string boundary; string localBoundary; // the ones used at the end or something lol bool isMultipart; bool needsSavedBody; ulong expectedLength; ulong contentConsumed; immutable(ubyte)[] buffer; // multipart parsing state int whatDoWeWant; bool weHaveAPart; string[] thisOnesHeaders; immutable(ubyte)[] thisOnesData; string postBody; UploadedFile piece; bool isFile = false; size_t memoryCommitted; // do NOT keep mutable references to these anywhere! // I assume they are unique in the constructor once we're all done getting data. string[][string] _post; UploadedFile[][string] _files; } PostParserState pps; } /// This represents a file the user uploaded via a POST request. 
static struct UploadedFile { /// If you want to create one of these structs for yourself from some data, /// use this function. static UploadedFile fromData(immutable(void)[] data, string name = null) { Cgi.UploadedFile f; f.filename = name; f.content = cast(immutable(ubyte)[]) data; f.contentInMemory = true; return f; } string name; /// The name of the form element. string filename; /// The filename the user set. string contentType; /// The MIME type the user's browser reported. (Not reliable.) /** For small files, cgi.d will buffer the uploaded file in memory, and make it directly accessible to you through the content member. I find this very convenient and somewhat efficient, since it can avoid hitting the disk entirely. (I often want to inspect and modify the file anyway!) If the file is very large, it is undesirable to eat that much memory just for a file buffer. In those cases, if you pass a large enough value for maxContentLength to the constructor so they are accepted, cgi.d will write the content to a temporary file that you can re-read later. You can override this behavior by subclassing Cgi and overriding the protected handleIncomingDataChunk method. Note that the object is not fully initialized when that method runs - the http headers are available, but cgi.post is not. You may parse the file as it streams in using this method. Anyway, if the file is small enough to be in memory, contentInMemory will be set to true, and the content is available in the content member. If not, contentInMemory will be set to false, and the content saved in a file, whose name will be available in the contentFilename member. Tip: if you know you are always dealing with small files, and want the convenience of ignoring this member, construct Cgi with a small maxContentLength. Then, if a large file comes in, it simply throws an exception (and HTTP error response) instead of trying to handle it. The default value of maxContentLength in the constructor is for small files. 
*/ bool contentInMemory = true; // the default ought to always be true immutable(ubyte)[] content; /// The actual content of the file, if contentInMemory == true string contentFilename; /// the file where we dumped the content, if contentInMemory == false. Note that if you want to keep it, you MUST move the file, since otherwise it is considered garbage when cgi is disposed. /// ulong fileSize() const { if(contentInMemory) return content.length; import std.file; return std.file.getSize(contentFilename); } /// void writeToFile(string filenameToSaveTo) const { import std.file; if(contentInMemory) std.file.write(filenameToSaveTo, content); else std.file.rename(contentFilename, filenameToSaveTo); } } // given a content type and length, decide what we're going to do with the data.. protected void prepareForIncomingDataChunks(string contentType, ulong contentLength) { pps.expectedLength = contentLength; auto terminator = contentType.indexOf(";"); if(terminator == -1) terminator = contentType.length; pps.contentType = contentType[0 .. terminator]; auto b = contentType[terminator .. $]; if(b.length) { auto idx = b.indexOf("boundary="); if(idx != -1) { pps.boundary = b[idx + "boundary=".length .. $]; pps.localBoundary = "\r\n--" ~ pps.boundary; } } // while a content type SHOULD be sent according to the RFC, it is // not required. We're told we SHOULD guess by looking at the content // but it seems to me that this only happens when it is urlencoded. 
if(pps.contentType == "application/x-www-form-urlencoded" || pps.contentType == "") { pps.isMultipart = false; pps.needsSavedBody = false; } else if(pps.contentType == "multipart/form-data") { pps.isMultipart = true; enforce(pps.boundary.length, "no boundary"); } else if(pps.contentType == "text/xml") { // FIXME: could this be special and load the post params // save the body so the application can handle it pps.isMultipart = false; pps.needsSavedBody = true; } else if(pps.contentType == "application/json") { // FIXME: this could prolly try to load post params too // save the body so the application can handle it pps.needsSavedBody = true; pps.isMultipart = false; } else { // the rest is 100% handled by the application. just save the body and send it to them pps.needsSavedBody = true; pps.isMultipart = false; } } // handles streaming POST data. If you handle some other content type, you should // override this. If the data isn't the content type you want, you ought to call // super.handleIncomingDataChunk so regular forms and files still work. // FIXME: I do some copying in here that I'm pretty sure is unnecessary, and the // file stuff I'm sure is inefficient. But, my guess is the real bottleneck is network // input anyway, so I'm not going to get too worked up about it right now. protected void handleIncomingDataChunk(const(ubyte)[] chunk) { if(chunk.length == 0) return; assert(chunk.length <= 32 * 1024 * 1024); // we use chunk size as a memory constraint thing, so // if we're passed big chunks, it might throw unnecessarily. // just pass it smaller chunks at a time. if(pps.isMultipart) { // multipart/form-data // FIXME: this might want to be factored out and factorized // need to make sure the stream hooks actually work. void pieceHasNewContent() { // we just grew the piece's buffer. Do we have to switch to file backing? if(pps.piece.contentInMemory) { if(pps.piece.content.length <= 10 * 1024 * 1024) // meh, I'm ok with it. return; else { // this is too big. 
if(!pps.isFile) throw new Exception("Request entity too large"); // a variable this big is kinda ridiculous, just reject it. else { // a file this large is probably acceptable though... let's use a backing file. pps.piece.contentInMemory = false; // FIXME: say... how do we intend to delete these things? cgi.dispose perhaps. int count = 0; pps.piece.contentFilename = getTempDirectory() ~ "arsd_cgi_uploaded_file_" ~ to!string(getUtcTime()) ~ "-" ~ to!string(count); // odds are this loop will never be entered, but we want it just in case. while(std.file.exists(pps.piece.contentFilename)) { count++; pps.piece.contentFilename = getTempDirectory() ~ "arsd_cgi_uploaded_file_" ~ to!string(getUtcTime()) ~ "-" ~ to!string(count); } // I hope this creates the file pretty quickly, or the loop might be useless... // FIXME: maybe I should write some kind of custom transaction here. std.file.write(pps.piece.contentFilename, pps.piece.content); pps.piece.content = null; } } } else { // it's already in a file, so just append it to what we have if(pps.piece.content.length) { // FIXME: this is surely very inefficient... we'll be calling this by 4kb chunk... std.file.append(pps.piece.contentFilename, pps.piece.content); pps.piece.content = null; } } } void commitPart() { if(!pps.weHaveAPart) return; pieceHasNewContent(); // be sure the new content is handled every time if(pps.isFile) { // I'm not sure if other environments put files in post or not... // I used to not do it, but I think I should, since it is there... pps._post[pps.piece.name] ~= pps.piece.filename; pps._files[pps.piece.name] ~= pps.piece; allPostNamesInOrder ~= pps.piece.name; allPostValuesInOrder ~= pps.piece.filename; } else { pps._post[pps.piece.name] ~= cast(string) pps.piece.content; allPostNamesInOrder ~= pps.piece.name; allPostValuesInOrder ~= cast(string) pps.piece.content; } /* stderr.writeln("RECEIVED: ", pps.piece.name, "=", pps.piece.content.length < 1000 ? 
to!string(pps.piece.content) : "too long"); */ // FIXME: the limit here pps.memoryCommitted += pps.piece.content.length; pps.weHaveAPart = false; pps.whatDoWeWant = 1; pps.thisOnesHeaders = null; pps.thisOnesData = null; pps.piece = UploadedFile.init; pps.isFile = false; } void acceptChunk() { pps.buffer ~= chunk; chunk = null; // we've consumed it into the buffer, so keeping it just brings confusion } immutable(ubyte)[] consume(size_t howMuch) { pps.contentConsumed += howMuch; auto ret = pps.buffer[0 .. howMuch]; pps.buffer = pps.buffer[howMuch .. $]; return ret; } dataConsumptionLoop: do { switch(pps.whatDoWeWant) { default: assert(0); case 0: acceptChunk(); // the format begins with two extra leading dashes, then we should be at the boundary if(pps.buffer.length < 2) return; assert(pps.buffer[0] == '-', "no leading dash"); consume(1); assert(pps.buffer[0] == '-', "no second leading dash"); consume(1); pps.whatDoWeWant = 1; goto case 1; /* fallthrough */ case 1: // looking for headers // here, we should be lined up right at the boundary, which is followed by a \r\n // want to keep the buffer under control in case we're under attack //stderr.writeln("here once"); //if(pps.buffer.length + chunk.length > 70 * 1024) // they should be < 1 kb really.... // throw new Exception("wtf is up with the huge mime part headers"); acceptChunk(); if(pps.buffer.length < pps.boundary.length) return; // not enough data, since there should always be a boundary here at least if(pps.contentConsumed + pps.boundary.length + 6 == pps.expectedLength) { assert(pps.buffer.length == pps.boundary.length + 4 + 2); // --, --, and \r\n // we *should* be at the end here! assert(pps.buffer[0] == '-'); consume(1); assert(pps.buffer[0] == '-'); consume(1); // the message is terminated by --BOUNDARY--\r\n (after a \r\n leading to the boundary) assert(pps.buffer[0 .. 
pps.boundary.length] == cast(const(ubyte[])) pps.boundary, "not lined up on boundary " ~ pps.boundary); consume(pps.boundary.length); assert(pps.buffer[0] == '-'); consume(1); assert(pps.buffer[0] == '-'); consume(1); assert(pps.buffer[0] == '\r'); consume(1); assert(pps.buffer[0] == '\n'); consume(1); assert(pps.buffer.length == 0); assert(pps.contentConsumed == pps.expectedLength); break dataConsumptionLoop; // we're done! } else { // we're not done yet. We should be lined up on a boundary. // But, we want to ensure the headers are here before we consume anything! auto headerEndLocation = locationOf(pps.buffer, "\r\n\r\n"); if(headerEndLocation == -1) return; // they *should* all be here, so we can handle them all at once. assert(pps.buffer[0 .. pps.boundary.length] == cast(const(ubyte[])) pps.boundary, "not lined up on boundary " ~ pps.boundary); consume(pps.boundary.length); // the boundary is always followed by a \r\n assert(pps.buffer[0] == '\r'); consume(1); assert(pps.buffer[0] == '\n'); consume(1); } // re-running since by consuming the boundary, we invalidate the old index. 
auto headerEndLocation = locationOf(pps.buffer, "\r\n\r\n"); assert(headerEndLocation >= 0, "no header"); auto thisOnesHeaders = pps.buffer[0..headerEndLocation]; consume(headerEndLocation + 4); // The +4 is the \r\n\r\n that caps it off pps.thisOnesHeaders = split(cast(string) thisOnesHeaders, "\r\n"); // now we'll parse the headers foreach(h; pps.thisOnesHeaders) { auto p = h.indexOf(":"); assert(p != -1, "no colon in header, got " ~ to!string(pps.thisOnesHeaders)); string hn = h[0..p]; string hv = h[p+2..$]; switch(hn.toLower) { default: assert(0); case "content-disposition": auto info = hv.split("; "); foreach(i; info[1..$]) { // skipping the form-data auto o = i.split("="); // FIXME string pn = o[0]; string pv = o[1][1..$-1]; if(pn == "name") { pps.piece.name = pv; } else if (pn == "filename") { pps.piece.filename = pv; pps.isFile = true; } } break; case "content-type": pps.piece.contentType = hv; break; } } pps.whatDoWeWant++; // move to the next step - the data break; case 2: // when we get here, pps.buffer should contain our first chunk of data if(pps.buffer.length + chunk.length > 8 * 1024 * 1024) // we might buffer quite a bit but not much throw new Exception("wtf is up with the huge mime part buffer"); acceptChunk(); // so the trick is, we want to process all the data up to the boundary, // but what if the chunk's end cuts the boundary off? If we're unsure, we // want to wait for the next chunk. We start by looking for the whole boundary // in the buffer somewhere. auto boundaryLocation = locationOf(pps.buffer, pps.localBoundary); // assert(boundaryLocation != -1, "should have seen "~to!string(cast(ubyte[]) pps.localBoundary)~" in " ~ to!string(pps.buffer)); if(boundaryLocation != -1) { // this is easy - we can see it in its entirety! 
pps.piece.content ~= consume(boundaryLocation); assert(pps.buffer[0] == '\r'); consume(1); assert(pps.buffer[0] == '\n'); consume(1); assert(pps.buffer[0] == '-'); consume(1); assert(pps.buffer[0] == '-'); consume(1); // the boundary here is always preceded by \r\n--, which is why we used localBoundary instead of boundary to locate it. Cut that off. pps.weHaveAPart = true; pps.whatDoWeWant = 1; // back to getting headers for the next part commitPart(); // we're done here } else { // we can't see the whole thing, but what if there's a partial boundary? enforce(pps.localBoundary.length < 128); // the boundary ought to be less than a line... assert(pps.localBoundary.length > 1); // should already be sane but just in case bool potentialBoundaryFound = false; boundaryCheck: for(int a = 1; a < pps.localBoundary.length; a++) { // we grow the boundary a bit each time. If we think it looks the // same, better pull another chunk to be sure it's not the end. // Starting small because exiting the loop early is desirable, since // we're not keeping any ambiguity and 1 / 256 chance of exiting is // the best we can do. if(a > pps.buffer.length) break; // FIXME: is this right? assert(a <= pps.buffer.length); assert(a > 0); if(std.algorithm.endsWith(pps.buffer, pps.localBoundary[0 .. a])) { // ok, there *might* be a boundary here, so let's // not treat the end as data yet. The rest is good to // use though, since if there was a boundary there, we'd // have handled it up above after locationOf. pps.piece.content ~= pps.buffer[0 .. $ - a]; consume(pps.buffer.length - a); pieceHasNewContent(); potentialBoundaryFound = true; break boundaryCheck; } } if(!potentialBoundaryFound) { // we can consume the whole thing pps.piece.content ~= pps.buffer; pieceHasNewContent(); consume(pps.buffer.length); } else { // we found a possible boundary, but there was // insufficient data to be sure. assert(pps.buffer == cast(const(ubyte[])) pps.localBoundary[0 .. 
pps.buffer.length]); return; // wait for the next chunk. } } } } while(pps.buffer.length); // btw all boundaries except the first should have a \r\n before them } else { // application/x-www-form-urlencoded and application/json // not using maxContentLength because that might be cranked up to allow // large file uploads. We can handle them, but a huge post[] isn't any good. if(pps.buffer.length + chunk.length > 8 * 1024 * 1024) // surely this is plenty big enough throw new Exception("wtf is up with such a gigantic form submission????"); pps.buffer ~= chunk; // simple handling, but it works... until someone bombs us with gigabytes of crap at least... if(pps.buffer.length == pps.expectedLength) { if(pps.needsSavedBody) pps.postBody = cast(string) pps.buffer; else pps._post = decodeVariables(cast(string) pps.buffer, "&", &allPostNamesInOrder, &allPostValuesInOrder); version(preserveData) originalPostData = pps.buffer; } else { // just for debugging } } } protected void cleanUpPostDataState() { pps = PostParserState.init; } /// You can override this function to somehow react /// to an upload in progress. /// /// Take note that parts of the Cgi object are not yet /// initialized! Stuff from HTTP headers, including get[], is usable. /// But, none of post[] is usable, and you cannot write here. That's /// why this method is const - mutating the object won't do much anyway. /// /// My idea here was so you can output a progress bar or /// something to a cooperative client (see arsd.rtud for a potential helper) /// /// The default is to do nothing. Subclass Cgi and use the /// CustomCgiMain mixin to do something here. void onRequestBodyDataReceived(size_t receivedSoFar, size_t totalExpected) const { // This space intentionally left blank. } /// Initializes the cgi from completely raw HTTP data. The ir must have a Socket source. 
/// *closeConnection will be set to true if you should close the connection after handling this request this(BufferedInputRange ir, bool* closeConnection) { isCalledWithCommandLineArguments = false; import al = std.algorithm; immutable(ubyte)[] data; void rdo(const(ubyte)[] d) { //import std.stdio; writeln(d); sendAll(ir.source, d); } auto ira = ir.source.remoteAddress(); auto irLocalAddress = ir.source.localAddress(); ushort port = 80; if(auto ia = cast(InternetAddress) irLocalAddress) { port = ia.port; } else if(auto ia = cast(Internet6Address) irLocalAddress) { port = ia.port; } // that check for UnixAddress is to work around a Phobos bug // see: https://github.com/dlang/phobos/pull/7383 // but this might be more useful anyway tbh for this case version(Posix) this(ir, ira is null ? null : cast(UnixAddress) ira ? "unix:" : ira.toString(), port, 0, false, &rdo, null, closeConnection); else this(ir, ira is null ? null : ira.toString(), port, 0, false, &rdo, null, closeConnection); } /** Initializes it from raw HTTP request data. GenericMain uses this when you compile with -version=embedded_httpd. NOTE: If you are behind a reverse proxy, the values here might not be what you expect.... it will use X-Forwarded-For for remote IP and X-Forwarded-Host for host Params: inputData = the incoming data, including headers and other raw http data. When the constructor exits, it will leave this range exactly at the start of the next request on the connection (if there is one). address = the IP address of the remote user _port = the port number of the connection pathInfoStarts = the offset into the path component of the http header where the SCRIPT_NAME ends and the PATH_INFO begins. _https = if this connection is encrypted (note that the input data must not actually be encrypted) _rawDataOutput = delegate to accept response data. It should write to the socket or whatever; Cgi does all the needed processing to speak http. 
_flush = if _rawDataOutput buffers, this delegate should flush the buffer down the wire closeConnection = if the request asks to close the connection, *closeConnection == true. */ this( BufferedInputRange inputData, // string[] headers, immutable(ubyte)[] data, string address, ushort _port, int pathInfoStarts = 0, // use this if you know the script name, like if this is in a folder in a bigger web environment bool _https = false, void delegate(const(ubyte)[]) _rawDataOutput = null, void delegate() _flush = null, // this pointer tells if the connection is supposed to be closed after we handle this bool* closeConnection = null) { // these are all set locally so the loop works // without triggering errors in dmd 2.064 // we go ahead and set them at the end of it to the this version int port; string referrer; string remoteAddress; string userAgent; string authorization; string origin; string accept; string lastEventId; bool https; string host; RequestMethod requestMethod; string requestUri; string pathInfo; string queryString; string scriptName; string[string] get; string[][string] getArray; bool keepAliveRequested; bool acceptsGzip; string cookie; environmentVariables = cast(const) environment.toAA; idlol = inputData; isCalledWithCommandLineArguments = false; https = _https; port = _port; rawDataOutput = _rawDataOutput; flushDelegate = _flush; nph = true; remoteAddress = address; // streaming parser import al = std.algorithm; // FIXME: this cast is technically wrong, but Phobos deprecated al.indexOf... for some reason. auto idx = indexOf(cast(string) inputData.front(), "\r\n\r\n"); while(idx == -1) { inputData.popFront(0); idx = indexOf(cast(string) inputData.front(), "\r\n\r\n"); } assert(idx != -1); string contentType = ""; string[string] requestHeadersHere; size_t contentLength; bool isChunked; { import core.runtime; scriptFileName = Runtime.args.length ? Runtime.args[0] : null; } int headerNumber = 0; foreach(line; al.splitter(inputData.front()[0 .. 
idx], "\r\n")) if(line.length) { headerNumber++; auto header = cast(string) line.idup; if(headerNumber == 1) { // request line auto parts = al.splitter(header, " "); if(parts.front == "PRI") { // this is an HTTP/2.0 line - "PRI * HTTP/2.0" - which indicates their payload will follow // we're going to immediately refuse this, im not interested in implementing http2 (it is unlikely // to bring me benefit) throw new HttpVersionNotSupportedException(); } requestMethod = to!RequestMethod(parts.front); parts.popFront(); requestUri = parts.front; // FIXME: the requestUri could be an absolute path!!! should I rename it or something? scriptName = requestUri[0 .. pathInfoStarts]; auto question = requestUri.indexOf("?"); if(question == -1) { queryString = ""; // FIXME: double check, this might be wrong since it could be url encoded pathInfo = requestUri[pathInfoStarts..$]; } else { queryString = requestUri[question+1..$]; pathInfo = requestUri[pathInfoStarts..question]; } auto ugh = decodeVariables(queryString, "&", &allGetNamesInOrder, &allGetValuesInOrder); getArray = cast(string[][string]) assumeUnique(ugh); if(header.indexOf("HTTP/1.0") != -1) { http10 = true; autoBuffer = true; if(closeConnection) { // on http 1.0, close is assumed (unlike http/1.1 where we assume keep alive) *closeConnection = true; } } } else { // other header auto colon = header.indexOf(":"); if(colon == -1) throw new Exception("HTTP headers should have a colon!"); string name = header[0..colon].toLower; string value = header[colon+2..$]; // skip the colon and the space requestHeadersHere[name] = value; if (name == "accept") { accept = value; } else if (name == "origin") { origin = value; } else if (name == "connection") { if(value == "close" && closeConnection) *closeConnection = true; if(value.asLowerCase().canFind("keep-alive")) { keepAliveRequested = true; // on http 1.0, the connection is closed by default, // but not if they request keep-alive. 
then we don't close // anymore - undoing the set above if(http10 && closeConnection) { *closeConnection = false; } } } else if (name == "transfer-encoding") { if(value == "chunked") isChunked = true; } else if (name == "last-event-id") { lastEventId = value; } else if (name == "authorization") { authorization = value; } else if (name == "content-type") { contentType = value; } else if (name == "content-length") { contentLength = to!size_t(value); } else if (name == "x-forwarded-for") { remoteAddress = value; } else if (name == "x-forwarded-host" || name == "host") { if(name != "host" || host is null) host = value; } // FIXME: https://tools.ietf.org/html/rfc7239 else if (name == "accept-encoding") { if(value.indexOf("gzip") != -1) acceptsGzip = true; } else if (name == "user-agent") { userAgent = value; } else if (name == "referer") { referrer = value; } else if (name == "cookie") { cookie ~= value; } else if(name == "expect") { if(value == "100-continue") { // FIXME we should probably give user code a chance // to process and reject but that needs to be virtual, // perhaps part of the CGI redesign. // FIXME: if size is > max content length it should // also fail at this point. _rawDataOutput(cast(ubyte[]) "HTTP/1.1 100 Continue\r\n\r\n"); // FIXME: let the user write out 103 early hints too } } // else // ignore it } } inputData.consume(idx + 4); // done requestHeaders = assumeUnique(requestHeadersHere); ByChunkRange dataByChunk; // reading Content-Length type data // We need to read up the data we have, and write it out as a chunk. if(!isChunked) { dataByChunk = byChunk(inputData, contentLength); } else { // chunked requests happen, but not every day. Since we need to know // the content length (for now, maybe that should change), we'll buffer // the whole thing here instead of parse streaming. 
(I think this is what Apache does anyway in cgi modes) auto data = dechunk(inputData); // set the range here dataByChunk = byChunk(data); contentLength = data.length; } assert(dataByChunk !is null); if(contentLength) { prepareForIncomingDataChunks(contentType, contentLength); foreach(dataChunk; dataByChunk) { handleIncomingDataChunk(dataChunk); } postArray = assumeUnique(pps._post); filesArray = assumeUnique(pps._files); files = keepLastOf(filesArray); post = keepLastOf(postArray); postBody = pps.postBody; cleanUpPostDataState(); } this.port = port; this.referrer = referrer; this.remoteAddress = remoteAddress; this.userAgent = userAgent; this.authorization = authorization; this.origin = origin; this.accept = accept; this.lastEventId = lastEventId; this.https = https; this.host = host; this.requestMethod = requestMethod; this.requestUri = requestUri; this.pathInfo = pathInfo; this.queryString = queryString; this.scriptName = scriptName; this.get = keepLastOf(getArray); this.getArray = cast(immutable) getArray; this.keepAliveRequested = keepAliveRequested; this.acceptsGzip = acceptsGzip; this.cookie = cookie; cookiesArray = getCookieArray(); cookies = keepLastOf(cookiesArray); } BufferedInputRange idlol; private immutable(string[string]) keepLastOf(in string[][string] arr) { string[string] ca; foreach(k, v; arr) ca[k] = v[$-1]; return assumeUnique(ca); } // FIXME duplication private immutable(UploadedFile[string]) keepLastOf(in UploadedFile[][string] arr) { UploadedFile[string] ca; foreach(k, v; arr) ca[k] = v[$-1]; return assumeUnique(ca); } private immutable(string[][string]) getCookieArray() { auto forTheLoveOfGod = decodeVariables(cookie, "; "); return assumeUnique(forTheLoveOfGod); } /// Very simple method to require a basic auth username and password. /// If the http request doesn't include the required credentials, it throws a /// HTTP 401 error, and an exception. 
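	///
	/// A minimal sketch of guarding a handler. The handler name `secretPage`, the credentials, and the realm text are placeholders, not part of this module:
	/// ---
	/// import arsd.cgi;
	///
	/// void secretPage(Cgi cgi) {
	/// 	// on a mismatch this sets a 401 status, closes the response, and throws
	/// 	cgi.requireBasicAuth("admin", "change-me", "Private area");
	///
	/// 	// only reached when the Authorization header matched
	/// 	cgi.write("Welcome back.", true);
	/// }
	///
	/// mixin GenericMain!secretPage;
	/// ---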
/// /// Note: basic auth does not provide great security, especially over unencrypted HTTP; /// the user's credentials are sent in plain text on every request. /// /// If you are using Apache, the HTTP_AUTHORIZATION variable may not be sent to the /// application. Either use Apache's built in methods for basic authentication, or add /// something along these lines to your server configuration: /// /// RewriteEngine On /// RewriteCond %{HTTP:Authorization} ^(.*) /// RewriteRule ^(.*) - [E=HTTP_AUTHORIZATION:%1] /// /// To ensure the necessary data is available to cgi.d. void requireBasicAuth(string user, string pass, string message = null) { if(authorization != "Basic " ~ Base64.encode(cast(immutable(ubyte)[]) (user ~ ":" ~ pass))) { setResponseStatus("401 Authorization Required"); header ("WWW-Authenticate: Basic realm=\""~message~"\""); close(); throw new Exception("Not authorized; got " ~ authorization); } } /// Very simple caching controls - setCache(false) means it will never be cached. Good for rapidly updated or sensitive sites. /// setCache(true) means it will always be cached for as long as possible. Best for static content. /// Use setResponseExpires and updateResponseExpires for more control void setCache(bool allowCaching) { noCache = !allowCaching; } /// Set to true and use cgi.write(data, true); to send a gzipped response to browsers /// who can accept it bool gzipResponse; immutable bool acceptsGzip; immutable bool keepAliveRequested; /// Set to true if and only if this was initialized with command line arguments immutable bool isCalledWithCommandLineArguments; /// This gets a full url for the current request, including port, protocol, host, path, and query string getCurrentCompleteUri() const { ushort defaultPort = https ? 
443 : 80; string uri = "http"; if(https) uri ~= "s"; uri ~= "://"; uri ~= host; /+ // the host has the port so p sure this never needed, cgi on apache and embedded http all do the right hting now version(none) if(!(!port || port == defaultPort)) { uri ~= ":"; uri ~= to!string(port); } +/ uri ~= requestUri; return uri; } /// You can override this if your site base url isn't the same as the script name string logicalScriptName() const { return scriptName; } /++ Sets the HTTP status of the response. For example, "404 File Not Found" or "500 Internal Server Error". It assumes "200 OK", and automatically changes to "302 Found" if you call setResponseLocation(). Note setResponseStatus() must be called *before* you write() any data to the output. History: The `int` overload was added on January 11, 2021. +/ void setResponseStatus(string status) { assert(!outputtedResponseData); responseStatus = status; } /// ditto void setResponseStatus(int statusCode) { setResponseStatus(getHttpCodeText(statusCode)); } private string responseStatus = null; /// Returns true if it is still possible to output headers bool canOutputHeaders() { return !isClosed && !outputtedResponseData; } /// Sets the location header, which the browser will redirect the user to automatically. /// Note setResponseLocation() must be called *before* you write() any data to the output. /// The optional important argument is used if it's a default suggestion rather than something to insist upon. 
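	///
	/// A sketch of both forms, assuming a hypothetical handler named `afterSave` and placeholder paths:
	/// ---
	/// import arsd.cgi;
	///
	/// void afterSave(Cgi cgi) {
	/// 	// a fallback suggestion: it is ignored if an important redirect is already set
	/// 	cgi.setResponseLocation("/", false);
	///
	/// 	// an insistent redirect with an explicit status instead of the default "302 Found";
	/// 	// it replaces the unimportant suggestion above
	/// 	cgi.setResponseLocation("/items", true, "303 See Other");
	///
	/// 	cgi.close();
	/// }
	///
	/// mixin GenericMain!afterSave;
	/// ---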
void setResponseLocation(string uri, bool important = true, string status = null) { if(!important && isCurrentResponseLocationImportant) return; // important redirects always override unimportant ones if(uri is null) { responseStatus = "200 OK"; responseLocation = null; isCurrentResponseLocationImportant = important; return; // this just cancels the redirect } assert(!outputtedResponseData); if(status is null) responseStatus = "302 Found"; else responseStatus = status; responseLocation = uri.strip; isCurrentResponseLocationImportant = important; } protected string responseLocation = null; private bool isCurrentResponseLocationImportant = false; /// Sets the Expires: http header. See also: updateResponseExpires, setPublicCaching /// The parameter is in unix_timestamp * 1000. Try setResponseExpires(getUTCtime() + SOME AMOUNT) for normal use. /// Note: the when parameter is different than setCookie's expire parameter. void setResponseExpires(long when, bool isPublic = false) { responseExpires = when; setCache(true); // need to enable caching so the date has meaning responseIsPublic = isPublic; responseExpiresRelative = false; } /// Sets a cache-control max-age header for whenFromNow, in seconds. void setResponseExpiresRelative(int whenFromNow, bool isPublic = false) { responseExpires = whenFromNow; setCache(true); // need to enable caching so the date has meaning responseIsPublic = isPublic; responseExpiresRelative = true; } private long responseExpires = long.min; private bool responseIsPublic = false; private bool responseExpiresRelative = false; /// This is like setResponseExpires, but it can be called multiple times. The setting most in the past is the one kept. /// If you have multiple functions, they all might call updateResponseExpires about their own return value. The program /// output as a whole is as cacheable as the least cachable part in the chain. 
/// setCache(false) always overrides this - it is, by definition, the strictest anti-cache statement available. If your site outputs sensitive user data, you should probably call setCache(false) when you do, to ensure no other functions will cache the content, as it may be a privacy risk. /// Conversely, setting here overrides setCache(true), since any expiration date is in the past of infinity. void updateResponseExpires(long when, bool isPublic) { if(responseExpires == long.min) setResponseExpires(when, isPublic); else if(when < responseExpires) setResponseExpires(when, responseIsPublic && isPublic); // if any part of it is private, it all is } /* /// Set to true if you want the result to be cached publically - that is, is the content shared? /// Should generally be false if the user is logged in. It assumes private cache only. /// setCache(true) also turns on public caching, and setCache(false) sets to private. void setPublicCaching(bool allowPublicCaches) { publicCaching = allowPublicCaches; } private bool publicCaching = false; */ /++ History: Added January 11, 2021 +/ enum SameSitePolicy { Lax, Strict, None } /++ Sets an HTTP cookie, automatically encoding the data to the correct string. expiresIn is how many milliseconds in the future the cookie will expire. TIP: to make a cookie accessible from subdomains, set the domain to .yourdomain.com. Note setCookie() must be called *before* you write() any data to the output. History: Parameter `sameSitePolicy` was added on January 11, 2021. +/ void setCookie(string name, string data, long expiresIn = 0, string path = null, string domain = null, bool httpOnly = false, bool secure = false, SameSitePolicy sameSitePolicy = SameSitePolicy.Lax) { assert(!outputtedResponseData); string cookie = std.uri.encodeComponent(name) ~ "="; cookie ~= std.uri.encodeComponent(data); if(path !is null) cookie ~= "; path=" ~ path; // FIXME: should I just be using max-age here? 
		// (also in cache below)
		if(expiresIn != 0)
			cookie ~= "; expires=" ~ printDate(cast(DateTime) Clock.currTime(UTC()) + dur!"msecs"(expiresIn));
		if(domain !is null)
			cookie ~= "; domain=" ~ domain;
		if(secure == true)
			cookie ~= "; Secure";
		if(httpOnly == true)
			cookie ~= "; HttpOnly";
		final switch(sameSitePolicy) {
			case SameSitePolicy.Lax:
				cookie ~= "; SameSite=Lax";
			break;
			case SameSitePolicy.Strict:
				cookie ~= "; SameSite=Strict";
			break;
			case SameSitePolicy.None:
				cookie ~= "; SameSite=None";
				assert(secure); // cookie spec requires this now, see: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie/SameSite
			break;
		}

		if(auto idx = name in cookieIndexes) {
			responseCookies[*idx] = cookie;
		} else {
			cookieIndexes[name] = responseCookies.length;
			responseCookies ~= cookie;
		}
	}
	private string[] responseCookies;
	private size_t[string] cookieIndexes;

	/// Clears a previously set cookie with the given name, path, and domain.
	void clearCookie(string name, string path = null, string domain = null) {
		assert(!outputtedResponseData);
		setCookie(name, "", 1, path, domain);
	}

	/// Sets the content type of the response, for example "text/html" (the default) for HTML, or "image/png" for a PNG image
	void setResponseContentType(string ct) {
		assert(!outputtedResponseData);
		responseContentType = ct;
	}
	private string responseContentType = null;

	/// Adds a custom header. It should be the name: value, but without any line terminator.
	/// For example: header("X-My-Header: Some value");
	/// Note you should use the specialized functions in this object if possible to avoid
	/// duplicates in the output.
	void header(string h) {
		customHeaders ~= h;
	}

	/++
		I named the original function `header` after PHP, but this pattern better fits
		the rest of the Cgi object.

		Either name is allowed.

		History:
			Alias added June 17, 2022.
+/ alias setResponseHeader = header; private string[] customHeaders; private bool websocketMode; void flushHeaders(const(void)[] t, bool isAll = false) { StackBuffer buffer = StackBuffer(0); prepHeaders(t, isAll, &buffer); if(rawDataOutput !is null) rawDataOutput(cast(const(ubyte)[]) buffer.get()); else { stdout.rawWrite(buffer.get()); } } private void prepHeaders(const(void)[] t, bool isAll, StackBuffer* buffer) { string terminator = "\n"; if(rawDataOutput !is null) terminator = "\r\n"; if(responseStatus !is null) { if(nph) { if(http10) buffer.add("HTTP/1.0 ", responseStatus, terminator); else buffer.add("HTTP/1.1 ", responseStatus, terminator); } else buffer.add("Status: ", responseStatus, terminator); } else if (nph) { if(http10) buffer.add("HTTP/1.0 200 OK", terminator); else buffer.add("HTTP/1.1 200 OK", terminator); } if(websocketMode) goto websocket; if(nph) { // we're responsible for setting the date too according to http 1.1 char[29] db = void; printDateToBuffer(cast(DateTime) Clock.currTime(UTC()), db[]); buffer.add("Date: ", db[], terminator); } // FIXME: what if the user wants to set his own content-length? // The custom header function can do it, so maybe that's best. // Or we could reuse the isAll param. if(responseLocation !is null) { buffer.add("Location: ", responseLocation, terminator); } if(!noCache && responseExpires != long.min) { // an explicit expiration date is set if(responseExpiresRelative) { buffer.add("Cache-Control: ", responseIsPublic ? "public" : "private", ", max-age="); buffer.add(responseExpires); buffer.add(", no-cache=\"set-cookie, set-cookie2\"", terminator); } else { auto expires = SysTime(unixTimeToStdTime(cast(int)(responseExpires / 1000)), UTC()); char[29] db = void; printDateToBuffer(cast(DateTime) expires, db[]); buffer.add("Expires: ", db[], terminator); // FIXME: assuming everything is private unless you use nocache - generally right for dynamic pages, but not necessarily buffer.add("Cache-Control: ", (responseIsPublic ? 
"public" : "private"), ", no-cache=\"set-cookie, set-cookie2\""); buffer.add(terminator); } } if(responseCookies !is null && responseCookies.length > 0) { foreach(c; responseCookies) buffer.add("Set-Cookie: ", c, terminator); } if(noCache) { // we specifically do not want caching (this is actually the default) buffer.add("Cache-Control: private, no-cache=\"set-cookie\"", terminator); buffer.add("Expires: 0", terminator); buffer.add("Pragma: no-cache", terminator); } else { if(responseExpires == long.min) { // caching was enabled, but without a date set - that means assume cache forever buffer.add("Cache-Control: public", terminator); buffer.add("Expires: Tue, 31 Dec 2030 14:00:00 GMT", terminator); // FIXME: should not be more than one year in the future } } if(responseContentType !is null) { buffer.add("Content-Type: ", responseContentType, terminator); } else buffer.add("Content-Type: text/html; charset=utf-8", terminator); if(gzipResponse && acceptsGzip && isAll) { // FIXME: isAll really shouldn't be necessary buffer.add("Content-Encoding: gzip", terminator); } if(!isAll) { if(nph && !http10) { buffer.add("Transfer-Encoding: chunked", terminator); responseChunked = true; } } else { buffer.add("Content-Length: "); buffer.add(t.length); buffer.add(terminator); if(nph && keepAliveRequested) { buffer.add("Connection: Keep-Alive", terminator); } } websocket: foreach(hd; customHeaders) buffer.add(hd, terminator); // FIXME: what about duplicated headers? // end of header indicator buffer.add(terminator); outputtedResponseData = true; } /// Writes the data to the output, flushing headers if they have not yet been sent. 
void write(const(void)[] t, bool isAll = false, bool maybeAutoClose = true) { assert(!closed, "Output has already been closed"); StackBuffer buffer = StackBuffer(0); if(gzipResponse && acceptsGzip && isAll) { // FIXME: isAll really shouldn't be necessary // actually gzip the data here auto c = new Compress(HeaderFormat.gzip); // want gzip auto data = c.compress(t); data ~= c.flush(); // std.file.write("/tmp/last-item", data); t = data; } if(!outputtedResponseData && (!autoBuffer || isAll)) { prepHeaders(t, isAll, &buffer); } if(requestMethod != RequestMethod.HEAD && t.length > 0) { if (autoBuffer && !isAll) { outputBuffer ~= cast(ubyte[]) t; } if(!autoBuffer || isAll) { if(rawDataOutput !is null) if(nph && responseChunked) { //rawDataOutput(makeChunk(cast(const(ubyte)[]) t)); // we're making the chunk here instead of in a function // to avoid unneeded gc pressure buffer.add(toHex(t.length)); buffer.add("\r\n"); buffer.add(cast(char[]) t, "\r\n"); } else { buffer.add(cast(char[]) t); } else buffer.add(cast(char[]) t); } } if(rawDataOutput !is null) rawDataOutput(cast(const(ubyte)[]) buffer.get()); else stdout.rawWrite(buffer.get()); if(maybeAutoClose && isAll) close(); // if you say it is all, that means we're definitely done // maybeAutoClose can be false though to avoid this (important if you call from inside close()! } /++ Convenience method to set content type to json and write the string as the complete response. History: Added January 16, 2020 +/ void writeJson(string json) { this.setResponseContentType("application/json"); this.write(json, true); } /// Flushes the pending buffer, leaving the connection open so you can send more. void flush() { if(rawDataOutput is null) stdout.flush(); else if(flushDelegate !is null) flushDelegate(); } version(autoBuffer) bool autoBuffer = true; else bool autoBuffer = false; ubyte[] outputBuffer; /// Flushes the buffers to the network, signifying that you are done. 
	/// You should always call this explicitly when you are done outputting data.
	void close() {
		if(closed)
			return; // don't double close

		if(!outputtedResponseData)
			write("", true, false);

		// writing auto buffered data
		if(requestMethod != RequestMethod.HEAD && autoBuffer) {
			if(!nph)
				stdout.rawWrite(outputBuffer);
			else
				write(outputBuffer, true, false); // tell it this is everything
		}

		// closing the last chunk...
		if(nph && rawDataOutput !is null && responseChunked)
			rawDataOutput(cast(const(ubyte)[]) "0\r\n\r\n");

		if(flushDelegate)
			flushDelegate();

		closed = true;
	}

	// Closes without doing anything, shouldn't be used often
	void rawClose() {
		closed = true;
	}

	/++
		Gets a request variable as a specific type, or the default value if it isn't there
		or isn't convertible to the requested type.

		Checks both GET and POST variables, preferring the POST variable, if available.

		A nice trick is using the default value to choose the type:

		---
			/*
				The return value will match the type of the default.
				Here, I gave 10 as a default, so the return value will
				be an int.

				If the user-supplied value cannot be converted to the
				requested type, you will get the default value back.
			*/
			int a = cgi.request("number", 10);

			if(cgi.get["number"] == "11")
				assert(a == 11); // conversion succeeds

			if("number" !in cgi.get)
				assert(a == 10); // no value means you can't convert - give the default

			if(cgi.get["number"] == "twelve")
				assert(a == 10); // conversion from string to int would fail, so we get the default
		---

		You can use an enum as an easy whitelist, too:

		---
			enum Operations {
				add, remove, query
			}

			auto op = cgi.request("op", Operations.query);

			if(cgi.get["op"] == "add")
				assert(op == Operations.add);
			if(cgi.get["op"] == "remove")
				assert(op == Operations.remove);
			if(cgi.get["op"] == "query")
				assert(op == Operations.query);

			if(cgi.get["op"] == "random string")
				assert(op == Operations.query); // the value can't be converted to the enum, so we get the default
		---
	+/
	T request(T = string)(in string name, in T def = T.init) const nothrow {
		try {
			return
				(name in post) ? to!T(post[name]) :
				(name in get)  ? to!T(get[name]) :
				def;
		} catch(Exception e) { return def; }
	}

	/// Is the output already closed?
	bool isClosed() const {
		return closed;
	}

	/++
		Gets a session object associated with the `cgi` request. You can use different types throughout your application.
	+/
	Session!Data getSessionObject(Data)() {
		if(testInProcess !is null) {
			// test mode
			auto obj = testInProcess.getSessionOverride(typeid(typeof(return)));
			if(obj !is null)
				return cast(typeof(return)) obj;
			else {
				auto o = new MockSession!Data();
				testInProcess.setSessionOverride(typeid(typeof(return)), o);
				return o;
			}
		} else {
			// normal operation
			return new BasicDataServerSession!Data(this);
		}
	}

	// if it is in test mode; triggers mock sessions.
	// Used by CgiTester
	version(with_breaking_cgi_features)
	private CgiTester testInProcess;

	/* Hooks for redirecting input and output */
	private void delegate(const(ubyte)[]) rawDataOutput = null;
	private void delegate() flushDelegate = null;

	/* This info is used when handling a more raw HTTP protocol */
	private bool nph;
	private bool http10;
	private bool closed;
	private bool responseChunked = false;

	version(preserveData) // note: this can eat lots of memory; don't use unless you're sure you need it.
	immutable(ubyte)[] originalPostData;

	/++
		This holds the posted body data if it has not been parsed into [post] and [postArray].

		It is intended to be used for JSON and XML request content types, but also may be used
		for other content types your application can handle. But it will NOT be populated for
		content types application/x-www-form-urlencoded or multipart/form-data, since those are
		parsed into the post and postArray members.

		Remember that anything beyond your `maxContentLength` param when setting up [GenericMain],
		etc., will be discarded, and the client will receive an error. This helps keep this array
		from exploding in size and consuming all your server's memory (though it may still be
		possible to eat excess ram from a concurrent client in certain build modes.)

		History:
			Added January 5, 2021
			Documented February 21, 2023 (dub v11.0)
	+/
	public immutable string postBody;
	alias postJson = postBody; // old name

	/* Internal state flags */
	private bool outputtedResponseData;
	private bool noCache = true;

	const(string[string]) environmentVariables;

	/** What follows is data gotten from the HTTP request. It is all fully immutable,
	    partially because it logically is (your code doesn't change what the user requested...)
	    and partially because I hate how bad programs in PHP change those superglobals to do
	    all kinds of hard to follow ugliness. I don't want that to ever happen in D.

	    For some of these, you'll want to refer to the http or cgi specs for more details.
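
	    A small sketch of reading this data, not taken from the original examples. The
	    handler name `describe` is hypothetical, and the sketch assumes the usual
	    `GenericMain` setup. Since the members are immutable, a handler that wants a
	    modified value builds a new string instead of editing the request:
	    ---
	    import arsd.cgi;
	    import std.conv : to;

	    void describe(Cgi cgi) {
	    	cgi.setResponseContentType("text/plain");

	    	// std.conv.to turns the RequestMethod enum back into its name
	    	cgi.write(to!string(cgi.requestMethod) ~ " " ~ cgi.requestUri ~ "\n");
	    	cgi.write("host: " ~ cgi.host ~ "\n");

	    	// header names are stored in lower case
	    	if(auto ua = "user-agent" in cgi.requestHeaders)
	    		cgi.write("user agent: " ~ *ua ~ "\n");
	    }

	    mixin GenericMain!describe;
	    ---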
*/ immutable(string[string]) requestHeaders; /// All the raw headers in the request as name/value pairs. The name is stored as all lower case, but otherwise the same as it is in HTTP; words separated by dashes. For example, "cookie" or "accept-encoding". Many HTTP headers have specialized variables below for more convenience and static name checking; you should generally try to use them. immutable(char[]) host; /// The hostname in the request. If one program serves multiple domains, you can use this to differentiate between them. immutable(char[]) origin; /// The origin header in the request, if present. Some HTML5 cross-domain apis set this and you should check it on those cross domain requests and websockets. immutable(char[]) userAgent; /// The browser's user-agent string. Can be used to identify the browser. immutable(char[]) pathInfo; /// This is any stuff sent after your program's name on the url, but before the query string. For example, suppose your program is named "app". If the user goes to site.com/app, pathInfo is empty. But, he can also go to site.com/app/some/sub/path; treating your program like a virtual folder. In this case, pathInfo == "/some/sub/path". immutable(char[]) scriptName; /// The full base path of your program, as seen by the user. If your program is located at site.com/programs/apps, scriptName == "/programs/apps". immutable(char[]) scriptFileName; /// The physical filename of your script immutable(char[]) authorization; /// The full authorization string from the header, undigested. Useful for implementing auth schemes such as OAuth 1.0. Note that some web servers do not forward this to the app without taking extra steps. See requireBasicAuth's comment for more info. immutable(char[]) accept; /// The HTTP accept header is the user agent telling what content types it is willing to accept. This is often */*; they accept everything, so it's not terribly useful. 
(The similar sounding Accept-Encoding header is handled automatically for chunking and gzipping. Simply set gzipResponse = true and cgi.d handles the details, zipping if the user's browser is willing to accept it.) immutable(char[]) lastEventId; /// The HTML 5 draft includes an EventSource() object that connects to the server, and remains open to take a stream of events. My arsd.rtud module can help with the server side part of that. The Last-Event-Id http header is defined in the draft to help handle loss of connection. When the browser reconnects to you, it sets this header to the last event id it saw, so you can catch it up. This member has the contents of that header. immutable(RequestMethod) requestMethod; /// The HTTP request verb: GET, POST, etc. It is represented as an enum in cgi.d (which, like many enums, you can convert back to string with std.conv.to()). A HTTP GET is supposed to, according to the spec, not have side effects; a user can GET something over and over again and always have the same result. On all requests, the get[] and getArray[] members may be filled in. The post[] and postArray[] members are only filled in on POST methods. immutable(char[]) queryString; /// The unparsed content of the request query string - the stuff after the ? in your URL. See get[] and getArray[] for a parse view of it. Sometimes, the unparsed string is useful though if you want a custom format of data up there (probably not a good idea, unless it is really simple, like "?username" perhaps.) immutable(char[]) cookie; /// The unparsed content of the Cookie: header in the request. See also the cookies[string] member for a parsed view of the data. /** The Referer header from the request. (It is misspelled in the HTTP spec, and thus the actual request and cgi specs too, but I spelled the word correctly here because that's sane. The spec's misspelling is an implementation detail.) 
	    It contains the site url that referred the user to your program; the site that linked to you, or if you're serving images, the site that has you as an image. Also, if you're in an iframe, the referrer is the site that is framing you.

	    Important note: if the user copy/pastes your url, this is blank, and, just like with all other user data, their browsers can also lie to you. Don't rely on it for real security.
	*/
	immutable(char[]) referrer;
	immutable(char[]) requestUri; /// The full url of the current request, excluding the protocol and host. requestUri == scriptName ~ pathInfo ~ (queryString.length ? "?" ~ queryString : "");

	immutable(char[]) remoteAddress; /// The IP address of the user, as we see it. (Might not match the IP of the user's computer due to things like proxies and NAT.)

	immutable bool https; /// Was the request encrypted via https?
	immutable int port; /// On what TCP port number did the server receive the request?

	/** Here come the parsed request variables - the things that come close to PHP's _GET, _POST, etc. superglobals in content. */

	immutable(string[string]) get; /// The data from your query string in the url, only showing the last string of each name. If you want to handle multiple values with the same name, use getArray. This only works right if the query string is x-www-form-urlencoded; the default you see on the web with name=value pairs separated by the & character.
	immutable(string[string]) post; /// The data from the request's body, on POST requests. It parses application/x-www-form-urlencoded data (used by most web requests, including typical forms), and multipart/form-data requests (used by file uploads on web forms) into the same container, so you can always access them the same way. It makes no attempt to parse other content types. If you want to accept an XML Post body (for a web api perhaps), you'll need to handle the raw data yourself.
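
	/+
		A minimal sketch of reading the parsed variables above, not taken from the
		original examples. The handler name `showTags` and the field names "tag" and
		"comment" are hypothetical. For a request such as `?tag=a&tag=b`, `get`
		exposes only the last value while `getArray` keeps every value.
		---
		import arsd.cgi;

		void showTags(Cgi cgi) {
			cgi.setResponseContentType("text/plain");

			// get holds only the last value submitted under each name
			if(auto last = "tag" in cgi.get)
				cgi.write("last tag: " ~ *last ~ "\n");

			// getArray keeps every value submitted under the name
			if(auto all = "tag" in cgi.getArray)
				foreach(value; *all)
					cgi.write("tag: " ~ value ~ "\n");

			// post is filled in from form-encoded or multipart request bodies
			if(auto comment = "comment" in cgi.post)
				cgi.write("comment: " ~ *comment ~ "\n");
		}

		mixin GenericMain!showTags;
		---
		Checking with `in` before dereferencing avoids a range error when the
		client did not send the field at all.
	+/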
immutable(string[string]) cookies; /// Separates out the cookie header into individual name/value pairs (which is how you set them!) /** Represents user uploaded files. When making a file upload form, be sure to follow the standard: set method="POST" and enctype="multipart/form-data" in your html