
/sugg/ - Suggestions / Meta

Site meta-talk, help, suggestions, and moderation discussion

The new CP spam filter now also works on posts that hide the link in the image instead of the post body.

File: 1446334933252.png (347.64 KB, 566x800, c28d40feaff4d394c7d519bcbc….png)

 No.2484

So, upon further investigation, it turns out that the spam bots are using countermeasures to thwart the less intensive anti-spam measures we could use. Unfortunately this means that there is no easy solution. Currently I can see three possible remedies.

1) Use captchas. (Once-a-day captchas are a feature that is specific to 8chan and I don't know if we can implement them.)

2) Use an automated spam-detection service like Akismet. The con here is that we'd have to share every post's content with Akismet's servers in order to check it for spam, including the user's IP address.

3) Hire a ridiculous number of Janitors and try to manage them all.

Word/phrase filters are impossible because the bots are too smart. Any comments or alternative suggestions are welcome.

 No.2485

>>2484
Janitors or captcha, lots of people can get spooked by sharing their IP.

 No.2486

I'm going to try one more thing real quick and see if it helps.

 No.2487

>>2486
what are you testing?

 No.2488

>>2487
I'm going to try using another DNS blocklist service. It checks only the poster's IP (without sharing post contents) against a list of known bot IPs. We were already using two blocklists, but neither of them was catching these particular bots. I don't have a good feeling about it working, though.
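
For readers unfamiliar with DNS blocklists, here is a minimal sketch of how such a lookup usually works for IPv4: the octets of the poster's IP are reversed, the blocklist's zone is appended, and the resulting name is resolved. An answer means "listed"; a failed lookup means "not listed". The zone `dnsbl.example.org` and the function names are placeholders, not the services or code the site actually uses.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets + blocklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the blocklist zone answers for this IP.

    Only the IP goes into the query; the post body is never sent anywhere.
    `resolve` is injectable so the function can be tested offline.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # NXDOMAIN (or any failed lookup) is treated as "not listed".
        return False


if __name__ == "__main__":
    # Hypothetical zone name; substitute a real blocklist zone to try it.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Treating every failed lookup as "not listed" keeps the board usable when the blocklist is unreachable, at the cost of letting spam through during an outage.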

 No.2489

>>2488
yeah, favouring toward captcha now, but still need a janitor all the time

 No.2490

>>2489
I'm worried that enabling a captcha will immediately and permanently kill the entire site.

 No.2491

>>2490
How would that happen?

 No.2492

>>2491
People would be too lazy to fill out the captcha, throw a bitch fit / complain, or possibly lose their entire post if they accidentally fill it out wrong. Maybe the place would be nicer (albeit slower) if all those people left though.

 No.2493

>>2492
Those sound like very minor problems to me, and they'd complain a lot more about Akismet. But what are the problems with hiring a bunch of janitors? I can guess a few, but I don't know much about this stuff.

 No.2494

>>2493
The biggest problem is there's no way to record the contents of posts that have been deleted. If a corrupt Janitor is deleting posts they dislike and saying they were spam posts, there is literally no way to prove them wrong. That's why I try to be careful about who I hire.

 No.2495

File: 1446337982804.gif (354.6 KB, 480x359, 1368255580169.gif)

Me for janitor.

 No.2496

>>2494
Yeah, that was one of my guesses, but how many people do you think you could trust with being a janitor? Gonna be problematic if you don't know a lot of people

 No.2497

>>2496
I could try just picking users I trust instead of holding applications.

 No.2498

File: 1446338563911.jpg (31.12 KB, 400x410, image.jpg)

>>2497
I-I have a lot of free time and I'd be glad to help, guessing you wouldn't want to hire me this quick t-though

 No.2500

(Sorry if my posts keep changing, I update them as I'm doing quick fact checks to make sure I'm not talking out of my ass or getting something wrong.)

Here's another thing, I prefer to only hire IRC users, because it allows me to communicate with the staff in real time. Sometimes people want to be staff but refuse to participate in the IRC community, which is an instant disqualification.

In fact, during the last Janitor applications, anyone who wasn't already an IRC user or who I hadn't seen on IRC frequently was disqualified, since I found it unlikely that I would be able to keep in contact with them on a regular basis.

 No.2501

>>2498
You're damn right you stuttering loser weeb.

 No.2502

File: 1446339016302.jpg (95.63 KB, 1280x720, image.jpg)

>>2500
I'm in the IRC sometimes and going on more frequently now
>>2501
hey no bully

 No.2503

This is not a Janitor application thread, let's please get back on topic.

 No.2504

>>2503
alright, sorry. I'd say go for the captcha, but try to find a way so they don't have to re-type their post if they get the captcha wrong. Pretty sure there aren't a lot of people who'd give up on posting just for having to type a blurry word

 No.2505

>>2504
Yeah, you're probably right. I'll try it if the new blocklist doesn't work.

 No.2507

File: 1446341071978.jpg (23.58 KB, 480x360, scioli2.jpg)

>Word/phrase filters are impossible because the bots are too smart.
I'm pretty sure they always post almost the same links, albeit with a different prefix/suffix each time. Do we have some kind of log of them? It'd be worth giving it a shot, although I'm gonna assume the links are really different and this isn't gonna work. In that case, I'd say that captchas are the best option, while also trying to hire new janitors. So far nobody has really complained about this proposal, plus we have already tried this once and it was well received by the community.
Now, it didn't work last time, are we sure it's gonna do something?

About the blocklist, well, we already had two, one more won't harm anyone.
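
If the links really do share a core and only the prefix/suffix changes, one way to try the idea above is to ignore the changing parts and match on the hostname alone. This is only a sketch under that assumption: `BANNED_DOMAINS` and its entry are made up, and it does nothing against links that are not in the post body.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Hypothetical entry; a real list would be built from logged spam posts.
BANNED_DOMAINS = {"spam-example.test"}


def linked_hosts(body: str) -> set[str]:
    """Collect the lowercased hostnames of every http(s) URL in a post body."""
    hosts = set()
    for url in URL_RE.findall(body):
        host = urlparse(url).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def has_banned_link(body: str, banned=BANNED_DOMAINS) -> bool:
    """True if any linked host is a banned domain or a subdomain of one."""
    for host in linked_hosts(body):
        for domain in banned:
            if host == domain or host.endswith("." + domain):
                return True
    return False
```

Matching on `"." + domain` rather than a bare suffix avoids flagging unrelated hosts that merely end in the same letters.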

 No.2508

File: 1446401066606.jpg (84.51 KB, 650x780, 1403307186832.jpg)

>1) Use captchas. (Once-a-day captchas are a feature that is specific to 8chan and I don't know if we can implement them.)
As long as it's not google captcha I'm fine w/ this

 No.2509

If all else fails, then yeah i'd be okay with captchas too. just not something annoying. The ones that are just numbers on a house or something are simple and easy.

 No.2510

>>2508
If we go the captcha route it might have to be Google captcha. Some bots are capable of solving simpler captchas these days. Of course we could test with another captcha to start with and see if it keeps them out.

 No.2511

>>2510
Oops, ReCaptcha is the only captcha that Vichan supports apparently :/

 No.2512

Why are Google captchas bad?

 No.2513

>>2512
Good question. Can anyone give me a reason why I shouldn't enable ReCaptcha?

 No.2514

File: 1446421836771.jpg (43.86 KB, 600x336, image.jpg)

>>2513
Nope.

 No.2515

>>2510
>>2511
>>2512
>>2513
Because google is a monster, I don't want to be datamined when I post on imageboards. Google captcha was one of the reasons why I left 4chan

 No.2516

>>2515
The alternative is literally infinite CP spam.

 No.2517

>>2515
>Google captcha was one of the reasons why I left 4chan
Go home Stallman, you're drunk.

 No.2518

>>2494
>The biggest problem is there's no way to record the contents of posts that have been deleted. If a corrupt Janitor is deleting posts they dislike and saying they were spam posts, there is literally no way to prove them wrong.
>there's no way to record the contents of posts that have been deleted.
This sounds like a desirable feature, some sort of logging system that keeps a temporary hidden store of deleted posts that management can check to keep jannies accountable for what they delete.
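
A rough sketch of what such a log could look like: deleted posts are copied into a hidden store together with who deleted them and why, and entries expire after a retention window. Nothing here is vichan code; the class, field names and the 14-day default are illustrative assumptions.

```python
import time
from dataclasses import dataclass


@dataclass
class DeletedPost:
    post_id: int
    body: str
    deleted_by: str
    reason: str
    deleted_at: float


class DeletionLog:
    """Temporary, staff-only store of deleted posts (hypothetical design)."""

    def __init__(self, retention_seconds: float = 14 * 24 * 3600):
        self.retention_seconds = retention_seconds
        self._entries: dict[int, DeletedPost] = {}

    def record(self, post_id, body, deleted_by, reason, now=None):
        """Keep a copy of the post at the moment a janitor deletes it."""
        now = time.time() if now is None else now
        self._entries[post_id] = DeletedPost(post_id, body, deleted_by, reason, now)

    def purge_expired(self, now=None):
        """Drop entries older than the retention window."""
        now = time.time() if now is None else now
        cutoff = now - self.retention_seconds
        self._entries = {
            pid: e for pid, e in self._entries.items() if e.deleted_at >= cutoff
        }

    def by_janitor(self, name):
        """Everything a given janitor deleted that is still retained."""
        return [e for e in self._entries.values() if e.deleted_by == name]
```

An admin could then review `by_janitor("some_name")` to check whether the posts marked as spam actually were spam.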

 No.2519

>>2515
I have to agree with this anon, as someone who's quite conscious about privacy, Google is a surveillance data-gathering behemoth to be avoided at all costs.

>>2511
Have you tried contacting czaks to see if he can add 8chan's captcha system to vichan or something? I think he was involved in its development ( related: https://github.com/vichan-devel/vichan/issues/140#issuecomment-94216050 )
You can find him on irc #vichan @irc.6irc.net ( https://webchat.6irc.net/?channels=vichan )

It seems to work at stopping spambots on 8chan, although as with all anti-spam measures it's a constant arms race, so the spammers may one day break it with OCR or something. Other new imageboards like Infinity Next are developing their own captcha systems ( https://github.com/infinity-next/infinity-next https://infinitydev.org/ ). IIRC czaks and a bunch of other imageboard owners and developers got together half a year or so ago to discuss the development of future imageboards. I'm not sure what's been going on with it now, but the channel is #metachan @irc.rizon.net and you can read the logs at http://carrier.6irc.net/metachan/

 No.2520

File: 1446494464776.jpg (14.58 KB, 336x365, nano captcha.jpg)

>>2516
I'm sorry to say this but I'm definitely not going to post here if you enable Google captcha. I'm not trying to be a douche or sound something along the lines of "hurr I'm leaving if u don't listen to me". Also it's not like I contributed much to this community, I just saw that you were affiliated with lainchan and I decided to lurk here.

 No.2521

>>2519
I'm in contact with czaks, but he's not so involved with vichan anymore and is trying to find a new maintainer. It's worth asking I guess.

>>2520
I understand your concern, so I'll try to find another solution first. At least the ghost thread bug is fixed so the CP threads won't stick around after we delete them. It makes it less obnoxious in the meantime while I come up with something.

 No.2522

>>2521
I'll stay tuned and lurk around in the meantime

 No.2523

For now I'm going to study the CP spam and try to filter some common phrases which appear more often. At least it should reduce the volume of spam.

 No.2524

ITT: freetards complaining about privacy over anonymous posts about laziness and shitposting.

 No.2525

Plz2 explain how captchas enable datamining.

Beyond that, I toss in a vote for more janitors with a tempzone outside of the public eye that only a core-tier (or just site admin) can view to keep an eye on the janitors and reactivate posts that didn't actually need to be removed.

(Not saying I favor captcha, I just dislike CP spam as much as most people. I actually HATE captchas because I just wanna post dammit >_<)

 No.2526

File: 1446527625151.png (1.24 MB, 958x916, 1446188947815.png)

>>2523
Make sure you 'study' those pictures long and hard!

 No.2527

>>2526
>long and hard
heh

 No.2528

>>2523
So, what would this mean for Janitors?

 No.2530

File: 1446570922093.png (635.47 KB, 1000x750, 1441132396772.png)

>>2528
It means
>u nigs do ur fucking job I have to literally design a word filter because janitors are ineffective as fuck

 No.2531

File: 1446592033493.jpg (48.54 KB, 800x535, 3792799-robber-with-laptop.jpg)

I have been lurking this image board for a small amount of time mainly on /n/. Can someone fill me in on what's going on? Reading through this thread it seems we're being spammed with CP and such. Does anyone know who is doing it and what motives they have?

 No.2532

File: 1446592125909.jpg (125.56 KB, 644x582, 1370650976120.jpg)

The blocklist

It's not working

 No.2533

>>2531
>Does anyone know who is doing it
Nobody in particular, if that's what you're asking. At least that's what I believe, it could be one of the spanish forum guys still mad or something for all we know.

>and what motives they have?

Spam for the sake of spam, to catch people interested in the material and try to get dem shekels.

>>2532
I personally believe they're not bots, since I saw this same shit in 8chan. That's also why I think captchas ain't gonna do, since they didn't work last time.
Sei, do you think it'd be possible to create a global "minimal time" between each post users make? Since, from what I recall, the "flood" detector only works if you're trying to post with a similar body in the same board, but not in different ones.
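
To make the question concrete, here is a minimal sketch of a site-wide minimum interval: the timer is keyed on the poster's IP only, not on the board or the post body, so posting to a different board does not reset it. The class name and the 30-second default are assumptions, not vichan settings.

```python
import time


class GlobalFloodGate:
    """Per-IP minimum time between posts, shared across all boards."""

    def __init__(self, min_interval: float = 30.0):
        self.min_interval = min_interval
        self._last_post: dict[str, float] = {}

    def allow(self, ip: str, now=None) -> bool:
        """Return True and start the timer if this IP may post now."""
        now = time.monotonic() if now is None else now
        last = self._last_post.get(ip)
        if last is not None and now - last < self.min_interval:
            return False  # too soon, regardless of which board
        self._last_post[ip] = now
        return True
```

A poster who simply waits out the interval, or who rotates IP addresses, still gets through, so this only limits the rate of a wave.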

 No.2534

>>2533
We already have a minimum time between posts. I think they're just waiting it out.

 No.2535

File: 1446593107375.gif (266.11 KB, 500x281, LOL.gif)

mfw everything fails

 No.2536

>>2534
I have an idea. It is stupid, but it could work. Set up bans so that the person doesn't know that they are banned: allow them to post, but hide the banned user's posts from everyone else, so that they think they are spamming but in reality the only ones able to see the posts are admins and the poster
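
The visibility rule described above can be sketched in a few lines. The `Post` fields and the `shadowbanned` flag are hypothetical; this only shows the filtering decision, not how a board would store or render posts.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: int
    author_ip: str
    body: str
    shadowbanned: bool = False


def visible_posts(posts, viewer_ip, viewer_is_staff=False):
    """Hide shadowbanned posts from everyone except staff and their author."""
    return [
        p for p in posts
        if not p.shadowbanned or viewer_is_staff or p.author_ip == viewer_ip
    ]
```

Because the author is recognised by IP here, the trick stops working as soon as the poster checks the thread from a different address.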

 No.2538

File: 1446597157530.jpg (106.57 KB, 932x651, happy-anime-reaction-gif-9.jpg)

We've come up with a solution, but it'll take some time to implement because we have to edit vichan's core. Banning image hashes is something that was only partially built into tinyboard in a usable capacity and unfortunately vichan hasn't expanded it any.

It's not a permanent solution, but since they (mostly) post the same image it'll slow them down until they cycle in a new one. Word filtering and ip banning don't seem to work either so this feels like the best option right now.
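
As an illustration of the general idea only (not the actual vichan change, which would be in PHP), banning by image hash boils down to hashing the uploaded file's bytes and refusing the upload if that hash is on a list. SHA-256 and all names below are assumptions made for the sketch.

```python
import hashlib

# In-memory stand-in for wherever banned hashes would really be stored.
BANNED_HASHES: set[str] = set()


def file_hash(data: bytes) -> str:
    """Hex digest of the raw file contents."""
    return hashlib.sha256(data).hexdigest()


def ban_file(data: bytes) -> None:
    """Add a known spam file to the ban list."""
    BANNED_HASHES.add(file_hash(data))


def is_banned(data: bytes) -> bool:
    """True if an uploaded file is byte-for-byte identical to a banned one."""
    return file_hash(data) in BANNED_HASHES
```

An exact hash only matches byte-identical files, so re-saving or changing a single byte of the image is enough to get around it.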

We'll keep you posted, but in the meantime just keep reporting posts like usual!

 No.2539

oops i forgot how to use my capcode

 No.2540

>>2536
I had that same idea a little while ago because it would be funny to punk the spammer like that, but it wouldn't work because they keep cycling between IP addresses whether we ban them or not.

 No.2541

>>2540
I also forgot how to use my capcode.

 No.2542

>>2541
Perhaps start banning VPNs, proxies and Tor nodes like 4chan does atm

 No.2543

>>2542
We've done that since the beginning.

 No.2544

>>2543
Any solution found yet?

 No.2545

>>2538
If you guys manage to come up with a way to actually do this, please push it through to the vichan git and don't sit on it. A lot of vichan imageboards have a terrible problem with spam and it would help a lot. Or at least make it publicly available. I run a french chan and we get spammed to all hell with CP, and even though I've banned entire countries it generally does not help.

 No.2550

>>2538
I literally just >>2538

>>2544
It won't be that big of a modification. If you know PHP and can find where the post filters are in vichan, you can pretty easily see where the idea for this is going.

 No.2557

File: 1447501837008.png (1000.13 KB, 628x788, 1403478160117.png)

>super fap

 No.2566

File: 1447747357604.jpg (49.15 KB, 640x360, Hacker1.jpg)

We now have someone working on advanced countermeasures.

 No.2567

File: 1447760288055.jpg (5.85 KB, 145x145, 1369429307271.jpg)

The advanced countermeasures,

it does nothing.

 No.2568

File: 1447763608684.jpg (30.37 KB, 369x292, 1447501741369.jpg)

Just shove captchas here I guess.

 No.2569

File: 1447772902906.jpg (104.97 KB, 608x430, 1441762739878.jpg)

>>2566
>advanced countermeasures.
Go Stallman, go!

 No.2570

File: 1447787451936.gif (1018.02 KB, 317x218, STAFF.gif)

Don't worry, Ubuu staff is preparing advanced countermeasures

 No.2571

>>2568
Captchas don't work against humans. The spammer is confirmed human.

I'm not going to describe here the countermeasures we're developing because we don't know if the spammer is watching our site. Last time we tried to stop the spam they adjusted their bots within the day to defeat our efforts.

 No.2641

Hmm. As a test, I could try putting the site behind Cloudflare. Maybe one of their countermeasures will catch the bot.

y/n?

 No.2649

File: 1451841345702.webm (2.72 MB, 845x480, Zetsubou Ramen.webm)

>>2641
y

The alternative is to get more janitors.
I'm tired of reporting CP threads that have been on the front page for hours.

 No.2650

More janitors. I logged on at school once and there was a bunch of disgusting spam, so I was like NOPE

 No.2651

Cloudflare is activated. Let's hope this works.

 No.2653

File: 1452546116271.jpg (23.35 KB, 478x350, 1449447484093.jpg)

Didn't work, get competent janitors ffs

 No.2654

We will occasionally keep trying new methods of thwarting the CP spam. Haven't given up yet.

 No.2657

>>2654
Thank you very much for the bugfixes.

 No.2826

Have you considered using heuristics like Mozilla Thunderbird's disturbingly effective junk email filter uses? It is "trained" by marking messages as junk or not junk and after a sufficient number of messages it begins to gain a pretty high degree of accuracy. It won't stop 100% of the crap but I would be surprised if it didn't cut down significantly on the spam. You could also consider using shadowban-like tactics to make it harder for the spammers to know that their posts failed; if a post matches a junk heuristic check, let it "post successfully" for that particular IP address and show in the thread as usual, but place it in a moderator queue before allowing it to show site-wide (this also gives the mods the chance to catch false positives and further refine the heuristics).
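
For anyone curious what "trained by marking messages as junk or not junk" amounts to, below is a toy naive Bayes classifier. It is a simplified sketch and not Thunderbird's implementation: it uses plain word counts with add-one (Laplace) smoothing, and the class and method names are made up.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """label is 'spam' or 'ham', as marked by a moderator."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label, vocab_size):
        total_docs = sum(self.doc_counts.values())
        # Smoothed log prior for the class.
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        for tok in tokens:
            count = self.word_counts[label][tok]
            # Add-one smoothing so unseen words never give probability 0.
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def is_spam(self, text: str) -> bool:
        tokens = tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = max(len(vocab), 1)
        spam = self._log_score(tokens, "spam", vocab_size)
        ham = self._log_score(tokens, "ham", vocab_size)
        return spam > ham
```

In use, every post a moderator deletes as spam would be fed to `train(body, "spam")` and ordinary posts to `train(body, "ham")`; accuracy depends entirely on how much has been marked.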

 No.2827

>>2826
To save you the time of finding it, the source (C++) is at http://hg.mozilla.org/comm-central/file/tip/mailnews/extensions/bayesian-spam-filter/src and you'll probably be interested in the info at https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering

I also just had the idea that you could make messages not post immediately even if not shadowbanned, but rather appear after an unspecified delay longer than just a few seconds. This would make it harder for the spammer to identify a shadowban condition by comparing the loaded page from a different IP address than the posting one.
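
The delay idea can be sketched as assigning each post a randomized publish time and showing it publicly only once that time has passed. The function names and the 1 to 10 minute range are arbitrary choices for the example.

```python
import random


def publish_time(posted_at: float, rng=random,
                 min_delay: float = 60.0, max_delay: float = 600.0) -> float:
    """Pick when a post becomes publicly visible: post time plus a random delay."""
    return posted_at + rng.uniform(min_delay, max_delay)


def is_publicly_visible(publish_at: float, now: float) -> bool:
    """Other visitors only see the post once its publish time has passed."""
    return now >= publish_at
```

Since every post is delayed by an unpredictable amount, a missing post seen from a second IP no longer proves that it was shadowbanned.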

I'd also suggest detecting open proxies and Tor exit nodes and possibly banning the use of them if that's where a good chunk of the spam is originating.

 No.2848

just a small comment, the mass reduction of the number of boards made it waaaaaay easier to clean up a wave of spam because they have fewer places to post it : D
but that solves nothing

 No.2850

>>2848
We haven't had spam in a good while. Or at least I didn't notice, which would be weird since I'm 24/7 here.

 No.2851

>>2850
I just cleaned a wave and I haven't been here in a few months so I got no idea how often it's been. Just noticed that it was a lot easier to clean than it used to be

 No.2852

File: 1456589144000.jpg (54.38 KB, 480x360, smug37.jpg)

>>2850
>We haven't had spam in a good while.
See Sei?

 No.2853

>>2852
spam came back because i decided to go give gardening advice in /hikki at 12AM on a thursday
look what i've done

 No.2854

Yeah the spam still happens several times a week, up to once a day. But, it hasn't been a problem since Jove made the IRC bot. Every time someone makes a post on the boards, the bot gives us a summary right away in the moderation channel. The spam posts are pretty obvious, and we have someone watching just about all the time, so we usually catch and stop new waves in their tracks in a couple minutes or less nowadays before anyone is likely to see them. For the most part it's put the issue to rest, though it would be great if the spambot would go away for good.
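
Jove's bot isn't shown in this thread, so the following is only a guess at the "summary" step: turning a new post into a single short line that could be sent to a moderation channel. The function name, fields and output format are invented for illustration, and the IRC connection itself is left out.

```python
def summarize_post(board: str, post_id: int, body: str,
                   has_file: bool, max_len: int = 80) -> str:
    """Build a one-line summary of a new post for a moderation channel."""
    text = " ".join(body.split())  # collapse newlines and repeated spaces
    if len(text) > max_len:
        text = text[: max_len - 3] + "..."
    marker = " [file]" if has_file else ""
    return f"/{board}/ No.{post_id}{marker}: {text}"


if __name__ == "__main__":
    print(summarize_post("sugg", 2854, "hello\nworld", False))
```

Keeping the line short matters because the point is for a human to spot an obvious spam post at a glance.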

 No.2901

>>2854
As a simpler alternative to an IRC bot (I spoke to you on IRC about spamming like a week ago), I enabled the RSS theme on my board and installed an RSS client on my PC to show new posts.

Word filters are pretty useless since the bots adjust. The good thing is that they always post a generic message, so I can tell when a post is probably a spambot, click the RSS popup and immediately D+B.


