• 0 Posts
  • 60 Comments
Joined 1 year ago
cake
Cake day: June 25th, 2023

help-circle





  • Python 2 had one mostly-working str class, and a mostly-broken unicode class.

    Python 3, for some reason, got rid of the one that mostly worked, leaving no replacement. The closest you can get is to spam surrogateescape everywhere, which is both incorrect and has significant performance cost - and that still leaves several APIs unavailable.

    Simply removing str indexing would’ve fixed the common user mistake if that was really desirable. It’s not like unicode indexing is meaningful either, and now large amounts of historical data can no longer be accessed from Python.



  • Unfortunately both of those are used in common English or computer words. The only letter pairs not used are: bq, bx, cf, cj, dx, fq, fx, fz, hx, jb, jc, jf, jg, jq, jv, jx, jz, kq, kz, mx, px, qc, qd, qg, qh, qj, qk, ql, qm, qn, qp, qq, qr, qt, qv, qx, qy, qz, sx, tx, vb, vc, vf, vj, vm, vq, vw, vx, wq, wx, xj, zx.

    Personally I have mappings based on <CR>, and press it twice to get a real newline.



    1. Sure, there are libraries and tools for some parts. But the question is: do they actually add value, or do they subtract it due to the cost it imposes to integrate them? There are only a couple parts where it is likely worth it (but even there, you’ll need to understand what they’re doing): the UDP reliability layer, the encryption layer, and possibly the event loop (but I say that mostly because io_uring is weird; previously I wouldn’t include this. On the client side it isn’t needed, but that depends on how much code you share between server and client). Packet framing and serialization are really easy to do yourself and most existing tools (which usually do generate code anyway) have weird limitations or overhead.
    2. “should do more” is a general rule to prevent easy DoS attacks. It can mean packet sizes, it can mean computation, it can mean hard-coded timer delays, it can mean all sorts of things. Don’t make it easy for a malicious client to waste shared resources. This is mostly relevant for early in the connection … but keep in mind that it’s possible for a malicious client to spy on a legitimate client and try to take over - that is, connections aren’t actually a real thing (this applies even for TCP).

  • You’ve clearly thought about the problem, so the solutions should be relatively obvious. Some less obvious ones:

    • It is impossible to make TCP reliable no matter how hard you try, because anybody can inject an RST flag at any time and cut off your connections (this isn’t theoretical, it’s actually quite common for long-lived gaming connections). That leaves UDP, for which there are several reliability layers, but most of them are not battle-tested - remember, TCP is most notable for congestion-control! HTTP3 is probably the only viable choice at scale, but beware that many implementations are very bad (e.g. not even supporting recvmmsg/sendmmsg which are critical for performance unlike with TCP; note the extra m)
    • If you don’t encrypt all your packets, you will have random middleware mess with their data. Think at least a little about key rotation.
    • To avoid application-centric DoS, make sure the client always does “more” than the server; this extends to e.g. packet sizes.
    • Prefer to ultimately define things in data, not code (e.g. network packet layouts). Don’t be afraid to write several bespoke code-generators; many real-world serialization formats in particular have unacceptable tradeoffs. Make sure the core code doesn’t care about the details (e.g. make every packet physically variable-length even if logically it is always fixed-length; you can also normalize zero-padding at this level for future compatibility. I advise against delta-compression at this level because that’s extra processing you don’t need).
    • Make sure the client only has to connect to a single server. If you have multiple servers internally, have a thin bouncer/proxy that forwards packets appropriately. This also has benefits for the inevitable DDoS attacks.
    • Latency is a bitch and has far-ranging effects, though this is highly dependent on not just genre but also UI. For example “hold down a key to move continuously through the world” is problematic whereas “click to move to a location” is not.
    • Beware quadratic complexity, e.g. if every player must send a location update to every player.
    • Think not only about the database, but how to back up the database and how to roll back in case of catastrophe or exploit. An append-only flat file has a lot going for it; only periodic repacking is needed and you can keep the old version for a while with a guarantee that it’ll replay to identical state to the initial version of the new file. Of course, the best state is no state at all. You will need to consider the notion of “transaction” at many levels, including scripting (you must give me 20 bear asses for me to give), trading between players, etc.
    • You will have abuse in chat. You will also have cybersex. It’s possible to deal with this in a privacy-preserving way by merely signing chat, not logging it, so the player can present evidence only if they wish, but there are a lot of concerns about e.g. replays, selective message subsets, etc.
    • There will be bots, especially if the official client isn’t good enough.
    • It’s $CURRENTYEAR; write code for IPv6 exclusively. There are sockopts for transparently handling legacy IPv4 clients.
    • Client IP address is private information. It is also the only way to deal with certain kinds of abuse. Sometimes, you just have to block all of Poland.
    • Note that routing in parts of the world is really bad. Sometimes setting up your own dedicated connection chain between datacenters can improve performance by orders of magnitude, rather than letting clients use whatever their ISP says. If nesting proxies be sure to correctly validate IPs.
    • Life is simpler if internal stuff listens on a separate port than external stuff, but still verify your peer. IP whitelisting is useless except for localhost (which, mind, is all of 127.0.0.0/8 for IPv4 - about the only time IPv4 is actually useful rather than a mere mirage).


  • Speed is far from the only thing that matters in terminal emulators though. Correctness is critical.

    The only terminals in which I have any confidence of correctness are xterm and pangoterm. And I suppose technically the BEL-for-ST extension is incorrect even there, but we have to live with that and a workaround is available.

    A lot of terminal emulators end up hard-coding a handful of common sequences, and fail to correctly ignore sequences they don’t implement. And worse, many go on to implement sequences that cannot be correctly handled.

    One simple example that usually fails: \e!!F. More nasty, however, are the ones that ignore intermediaries and execute some unrelated command instead.

    I can’t be bothered to pick apart specific terminals anymore. Most don’t even know what an IR is.


  • I guess I forgot to mention the other implicit difference in concerns:

    When you are a game, you can reasonably assume: I have the user’s full focus and can take all the computing resources of their device, barring a few background apps.

    When you are an application, the user will almost always have several other applications running to a meaningful degree, and those eat into available resources (often in a difficult-to-measure way). Unfortunately this rarely gets tested.

    I’m not saying you can’t write an app using a game toolkit or vice versa, but you have to be aware of the differences and figure out how to configure it correctly for your use case.

    (though actually - some purely-turn-based games that do nothing until user enters input do just fine on app toolkits. But the existence of such games means that game toolkits almost always support some way of supporting the app paradigm. By contrast, app toolkits often lack ready support for continuous game paradigms … unless you use APIs designed for video playback, often involving creating a separate child “window”. Actual video playback is really hard; even the makers of dedicated video-playing programs mess it up.)


  • There’s tends to be one major difference between games and non-game applications, so toolkits designed for one are often quite unsuitable for the other.

    A game generally performs logic to paint the whole window, every frame, with at most some framerate-limiting in “paused” states. This burns power but is steady and often tries hard to reduce latency.

    An application generally tries to paint as little of the window as possible, as rarely as possible. Reducing video bandwidth means using a lot less power, but can involve variable loads so sometimes latency gets pushed down to “it would be nice”.

    Notably, the implications of the 4-way choice between {tearing, vsync, double-buffer, triple-buffer} looks very different between those two - and so does the question of “how do we use the GPU”?



  • Even logging can sometimes be enough to hide the heisgenbug.

    Logging to a file descriptor can sometimes be avoided by logging to memory (which for crash-safety includes the possibility of an mmap’ed file, since the kernel will just take care of them as long as the whole system doesn’t go down). But logging from every thread to a single section of memory can also be problematic (even without mutexes, atomics can be expensive and certainly have side-effects) - sometimes you need a separate per-thread log, and combine in the log-reader tool.