We all migrate to smaller websites and try not to post anything outside them that draws attention, just to hide from the “AI” crawlers. The internet seems dead except for the few pockets we each know exist away from the clankers.

  • dual_sport_dork 🐧🗡️@lemmy.world
    English
    arrow-up
    87
    ·
    21 hours ago

    That’s because it’s numerically possible to sweep through the entire IPv4 address range fairly trivially, especially if you do it in parallel with some kind of botnet, proverbially jiggling the digital door handles of every server in the world to see if any of them happen to be unlocked.

    One wonders if switching to purely IPv6 will forestall this somewhat, as the number space is multiple orders of magnitude larger. That’s only security through obscurity, though, and it’s certain the bots will still find you eventually. Plus, if you have a domain name, the attackers already know where you are: they can just look up your DNS record, which is what DNS records are for.
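
A rough back-of-the-envelope sketch of the sweep arithmetic above. The probe rate is an assumed round number picked for illustration, not a measurement of any real scanner or botnet.

```python
# Rough sweep-time arithmetic for IPv4 versus IPv6.
# PROBES_PER_SECOND is an assumed figure for illustration only.
IPV4_ADDRESSES = 2 ** 32
IPV6_ADDRESSES = 2 ** 128
PROBES_PER_SECOND = 1_000_000  # assumption: one million probes per second
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def sweep_seconds(address_count, probes_per_second=PROBES_PER_SECOND):
    """Seconds needed to probe every address once at a fixed rate."""
    return address_count / probes_per_second


ipv4_hours = sweep_seconds(IPV4_ADDRESSES) / 3600
ipv6_years = sweep_seconds(IPV6_ADDRESSES) / SECONDS_PER_YEAR

print(f"IPv4 sweep: about {ipv4_hours:.1f} hours")
print(f"IPv6 sweep: about {ipv6_years:.2e} years")
```

At that assumed rate the whole IPv4 space takes a bit over an hour, while the IPv6 figure comes out many orders of magnitude beyond any practical timescale. That only covers blind sweeping; it says nothing about addresses leaked through DNS or other channels.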

    • SkyeStarfall@lemmy.blahaj.zone
      arrow-up
      2
      ·
      5 hours ago

      It’s not as simple as “only security through obscurity”. You could say the same thing about an encryption key of a certain length. The private key matching a public key is technically just obscurity too, but it’s still impractical to actually search the entire range.

      IPv6 is big enough that sweeping it becomes impractical. But of course, as you said, there may be other methods of finding your address.

    • kazaika@lemmy.world
      arrow-up
      6
      ·
      18 hours ago

      Servers that are meant to be secure are usually configured not to react to pings and not to give failure responses to unauthenticated requests. This should be viable for an authenticated-only, walled-garden-type website like the one OP is suggesting, no?

      • Cooper8@feddit.online
        English
        arrow-up
        2
        ·
        13 hours ago

        I have suggested a couple of times now that ActivityPub should implement an encryption layer for user authentication of requests and pings. It already has a system for instances vouching for each other. The situation is that users of “walled garden” instances in ActivityPub lack a means of interfacing with public-facing instances that doesn’t leave the network open to scraping. I believe a pivot towards serving content to registered users only by default, built on encrypted handshakes, with the ability for servers to opt in to serving content to unregistered users, would make the whole network much more robust and less dependent on third-party contingencies like CloudFlare.

        Then again, maybe I should just be looking for a different network. I’m sure there are services in the blockchain/cryptosphere that take that approach; I would just rather participate in a network built on commons rather than one with financialization at its core. Where is the protocol doing both a hardened network and distributed volunteer instances?

      • dual_sport_dork 🐧🗡️@lemmy.world
        English
        arrow-up
        1
        ·
        14 hours ago

        There are several things you could do in that regard, I’m sure. Configure your services to listen only on weird ports, disable ICMP pings, jigger your scripts to return timeouts instead of error messages… Many of which might make your own life difficult, as well.

        All of these are also completely counterproductive if you want your hosted service, whatever it is, to be accessible to others. Or maybe not, if you don’t. The point is, the bots don’t have to find every single web service and site with 100% accuracy. The hackers only have to get lucky once and stumble their way into e.g. someone’s unsecured web host where they can push more malware, or a pile of files they can encrypt and demand a ransom, or personal information they can steal, or content they can scrape with their dumb AI, or whatever. But they can keep on trying until the sun burns out basically for free, and you have to stay lucky and under the radar forever.

        In my case, just to name an example, I kind of need my site to be accessible to the public at large if I want to, er, actually make any sales.

    • kossa@feddit.org
      arrow-up
      9
      ·
      20 hours ago

      But an IP can have multiple websites and even not return anything on plain IP access. How do crawlers find out about domains and unlinked subdomains? Do they even?

        • taaz@biglemmowski.win
          arrow-up
          10
          ·
          edit-2
          18 hours ago

          Thinking about this, wouldn’t the best way to hide a modern website be something along the lines of getting a wildcard domain cert (can be done with LE using the DNS challenge), CNAMEing the wildcard to the root domain, and then hosting the website on a random subdomain string? Am I missing something?
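
A minimal sketch of the "random subdomain string" part of that idea, assuming hex characters are acceptable for the label. The function name `random_label` and the `example.com` domain are placeholders, not anything from the thread.

```python
import secrets


def random_label(num_bytes=16):
    """Return a random hex string usable as a single DNS label.

    Hex output is 2 * num_bytes characters long; a DNS label may be at
    most 63 characters, so num_bytes is capped at 31.
    """
    if not 1 <= num_bytes <= 31:
        raise ValueError("num_bytes must be between 1 and 31")
    return secrets.token_hex(num_bytes)


label = random_label()
hostname = f"{label}.example.com"  # example.com is a placeholder domain
print(hostname)
```

With the default of 16 random bytes the label carries 128 bits of randomness, so guessing it blind is about as hopeless as sweeping IPv6. It only stays hidden as long as the full hostname is never published anywhere else.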

          • confusedpuppy@lemmy.dbzer0.com
            arrow-up
            7
            ·
            16 hours ago

            I do something like this using wildcard certs with Let’s Encrypt. Except I go one step further: my ISP blocks incoming data on common ports, so I end up using an uncommon port as well.

            I’m not hosting anything important and I don’t always need access to it; it’s mostly just for fun for myself.

            Accessing my site ends up looking like https://randomsubdomain.registered-domain-name.com:4444/

            My logs only ever show my own activity. I’m sure there are downsides to using uncommon ports but I mitigate that by adjusting my personal life to not caring about being connected to my stuff at all times.

            I get to have my little hobby in my own corner of the internet without the worry of bots or AI.

      • simeon@reddthat.com
        arrow-up
        3
        ·
        20 hours ago

        Every SSL certificate is publicly logged (you can see these logs at crt.sh, for example), and you might be able to read DNS records to find new (sub)domains. The modern internet is too focused on being discoverable and transparent to make hiding an entire service (domain + servers) feasible. But things like example.com/dhusvsuahavag8wjwhsusiajaosbsh are entirely unfindable as long as they are not linked to.
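
A hedged sketch of how hostnames fall out of those certificate logs. It assumes crt.sh's JSON output (the `output=json` query parameter) returns a list of objects with a `name_value` field holding newline-separated names. The sample payload below is made up for illustration, and no network request is made.

```python
import json
from urllib.parse import urlencode


def crtsh_url(domain):
    """Build a crt.sh search URL; '%' is the wildcard in its query syntax."""
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})


def names_from_ct(payload):
    """Collect the unique hostnames listed in a crt.sh-style JSON payload."""
    names = set()
    for entry in json.loads(payload):
        # Assumed field: name_value, one hostname per line.
        for name in entry.get("name_value", "").splitlines():
            names.add(name.strip().lower())
    return sorted(names)


# Hypothetical sample data shaped like the assumed response format.
SAMPLE = json.dumps([
    {"name_value": "example.com\nwww.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "shop.example.com"},
])

print(crtsh_url("example.com"))
print(names_from_ct(SAMPLE))
```

Note what a wildcard certificate contributes here: only `*.example.com` shows up, not whichever subdomain is actually in use.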

        • kossa@feddit.org
          arrow-up
          4
          ·
          11 hours ago

          Random subdomain on a wildcard certificate, IP written into the hosts file to avoid DNS records, only shared by word of mouth 😅.

          Nobody said the uncrawled dark forest would be comfortable.

    • lauha@lemmy.world
      arrow-up
      2
      arrow-down
      1
      ·
      16 hours ago

      I love your “multiple orders of magnitude”. I don’t think you appreciate or realise how much larger the IPv6 address space is :)

      • dual_sport_dork 🐧🗡️@lemmy.world
        English
        arrow-up
        7
        ·
        15 hours ago

        I wasn’t going to type that many commas for the sake of brevity, but it’s 340,282,366,920,938,463,463,374,607,431,768,211,456 possible addresses, i.e. 2^128. So yes, I do.

        I consider 96 orders (in binary, anyway) as “multiple.” Wouldn’t you?
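
For anyone who wants to check those figures, a quick sketch using Python's arbitrary-precision integers. It only restates the numbers quoted above and converts the 96 binary orders into decimal ones.

```python
import math

IPV4_BITS = 32
IPV6_BITS = 128

ipv6_total = 2 ** IPV6_BITS
binary_orders = IPV6_BITS - IPV4_BITS                 # doublings from IPv4 to IPv6
decimal_orders = math.log10(2 ** binary_orders)       # the same ratio in powers of ten

print(f"{ipv6_total:,}")
print(f"{binary_orders} binary orders, roughly {decimal_orders:.1f} decimal orders")
```

So the ratio between the two address spaces is 2^96, which is a little under 29 orders of magnitude in base ten.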

        • lauha@lemmy.world
          arrow-up
          1
          arrow-down
          1
          ·
          7 hours ago

          No need to be defensive. I’m not insulting you, I just find it funny :) Usually people call that “dozens”. But “dozens of orders of magnitude” really doesn’t give a sense of the scale.

          You could have 8 billion inhabitants around each of the 10^24 stars in the universe, and everyone could still have 42k addresses.
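
The per-person figure is easy to reproduce. This sketch takes the population and star count exactly as stated above (both are round estimates, not precise values).

```python
TOTAL_ADDRESSES = 2 ** 128
PEOPLE_PER_STAR = 8 * 10 ** 9   # 8 billion inhabitants, as stated above
STARS = 10 ** 24                # star count, as stated above

# Integer division: whole addresses available to each person.
addresses_per_person = TOTAL_ADDRESSES // (PEOPLE_PER_STAR * STARS)
print(addresses_per_person)
```

That works out to roughly 42.5 thousand addresses each, which matches the “42k” estimate.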