
 

About to enforce network post expiry


 
Huh, posted before restricting visibility.

This only applies to the 5 people mentioned in the post.
 
Named and shamed. 😂
 
They should be honoured for being the heavy user guinea pigs. :-D
 
So this is like "reshares" will be deleted?
 
I don't believe so - reshares should, in my opinion, be treated locally as your own posts. I can dig a little deeper and see if I can validate that.

Oh, and to clarify further (sorry for the multiple edits): the mechanism with the settings I've described expires all the items that your account was responsible for bringing onto the system but that aren't your own posts. For example, say you're the only person subscribed to example@some.site. Once the posts from that account have sat around on the system for 6 months, and no other account here is laying claim to them, it's time to purge them off the system.
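
To make that rule concrete, here is a minimal sketch of the selection logic as described above. The `Item` class, its fields, and the `expirable` function are hypothetical names made up for illustration; this is not Friendica's actual schema or expiry code, and "6 months" is approximated as 180 days.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Item:
    """Hypothetical stand-in for a stored post; not Friendica's real schema."""
    author: str                                   # account that wrote the post
    received: datetime                            # when the item arrived on this server
    claimed_by: set = field(default_factory=set)  # local accounts that brought it in


def expirable(items, account, now, max_age=timedelta(days=180)):
    """Return items that only `account` brought in, that it did not write, and that are old."""
    cutoff = now - max_age
    return [
        item for item in items
        if item.claimed_by == {account}   # nobody else lays claim to the item
        and item.author != account        # it is not the account's own post
        and item.received < cutoff        # it has sat around longer than max_age
    ]
```

Under these assumptions, a post from example@some.site that only you subscribe to is selected once it is older than the cutoff, while your own posts and posts that another local account also follows are left alone.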

That said, are we worried about reshares from > 6 months ago?
 
I've just never thought about content I create on the system expiring on any system I've ever used before. So I'm trying to get my brain wrapped around it. The backup doesn't work so I'm going to try to write a backup script that uses the API tonight.
 
What do you mean by "the backup doesn't work"?
 
When I try to export my data it stalls after about 300 MB and I have no data.
 
Got it. We do try to do everything in one page load, which isn't advisable for large amounts of data. We should perform the export in the background and provide the archive later.
 
Interesting... I'm pretty sure we were having issues with exporting all data... ah yes: https://github.com/friendica/friendica/issues/8169 and https://social.isurf.ca/display/c443a55c-165e-2c7e-4568-e73731233195

But, I just tried doing an export all and got a 16MB dump of stuff... can't really look at it in much detail right now, but it might be in better shape now.
 
I just tried it before posting that in case it was fixed in the latest version. I should be able to get what I need via the API
 
Out of curiosity, @Hank G is there stuff you pull from feeds of others that you feel should be hung on to?

Yes, I eventually want users' own posts to not sit around forever either (I think anyone running an instance with more than 10 people on it will find they either need to trim at some point or keep paying for more and more storage), but right now I'm looking to see how much sanity I can create in the DB in smaller steps.
 
No, that's transparent to me. I couldn't care less about that. I don't know how AP works, but for Diaspora federation, once the federated post is removed from the local service there is no way to render it again. It creates a theoretical UX problem, but generally people aren't looking at posts from a year ago from another user. Worst case, they can jump back to the user's home server.
 
I'm halfway done writing a download script that uses the API. I'm adding throttling to it so that it doesn't accidentally DDoS the server with requests. I have a little over 2000 posts, by the way I count it. So even with throttling I hope to have it all archived locally in the next day or two. Fingers crossed...
 
Good stuff! Look forward to hearing how it works for you. I would expect that if it's just grabbing post data of your own, it wouldn't have to be throttled down too terribly crazily. ie: I figure if the requests were spread over a handful of minutes it should be OK.

When I moved Friendica over to the last release candidate, something triggered "everybody" (masses of other instances) to contact our site here. System chugged away processing over 500 reqs/second for about an hour or so. Normal request/second average is under 5.
 
@Anadam :-D I'm planning on timing the response time from the server. If it's less than 5 seconds, I'll make the timeout 5 seconds. If it's more than 5 seconds, I'll make the timeout the response time. All requests will be handled linearly. It'll also be doing hits for images for each of the posts, users, etc. All of that will be done linearly as well, probably with timeouts of 300 ms or something like that.
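
A rough sketch of that throttling idea, for anyone following along: requests go out strictly one at a time, and the pause before the next one is the larger of a fixed minimum and the time the last response took. The function name `throttled_fetch` and the injected `fetch`, `sleep`, and `clock` parameters are assumptions for illustration, not the actual backup script.

```python
import time


def throttled_fetch(urls, fetch, min_delay=5.0, sleep=time.sleep, clock=time.monotonic):
    """Fetch a list of URLs one at a time, pausing between requests.

    The pause is max(min_delay, duration of the last request), so a slow
    server automatically gets more breathing room.
    `fetch` is a hypothetical callable that takes a URL and returns the response.
    """
    results = []
    for index, url in enumerate(urls):
        started = clock()
        results.append(fetch(url))
        elapsed = clock() - started
        if index < len(urls) - 1:          # no need to wait after the last request
            sleep(max(min_delay, elapsed))
    return results
```

For the image and avatar hits, the same function could be called with a much smaller `min_delay`, for example `min_delay=0.3`.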
 
...I came up with that because in manual testing the calls to the "api/statuses/user_timeline" endpoint are very very slow. Like "I think it's doing a full table scan" slow.
 
Fine by me. I haven't even thought about expiry since I first joined. I'll do so now.
 
Done. Set to 90 days. I'll start using the star feature to save selected items.
 
Wow, thanks @Phil Landmeier (ᚠ) - that's even better than what I was aiming for at this time.
 
Oh, well, cool. It doesn't feel like I post that much, but I do. Lol.
 
@Anadam :-D OK...running my backup program now...hopefully the only way you noticed was me telling you :)
 
Yup, very little noticed on the monitoring graphs!
 
That's good to know. The query performance was abysmal. I think it's doing full table scans, so that's putting some load on the server. I posted some data to the Friendica matrix channel. Here's the plot of the queries. Average is 10 seconds with the median at over 7 seconds. Also, weirdly, it's not always returning the 20 requested max posts, so I'm not sure I'm actually getting all of them, except that it is *consistently* returning fewer than the requested number on the exact same pages: https://cloud.feneas.org/s/p9FaFXttQMPQJcD
 
Unfortunately, we do filter some posts after the query is performed, based on user settings. The query has a limit of 20 results returned, so the final result may contain fewer.
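
A toy illustration of why limiting first and filtering afterwards produces short pages. The function `query_then_filter` and the dict-shaped posts are made up for this example; the real filtering in Friendica may work differently.

```python
def query_then_filter(posts, blocked_authors, limit=20, offset=0):
    """Apply the limit first (as the database query would), then drop blocked authors."""
    page = posts[offset:offset + limit]
    return [post for post in page if post["author"] not in blocked_authors]


# Hypothetical data: every fifth post comes from a blocked contact.
posts = [{"id": i, "author": "blocked" if i % 5 == 0 else "friend"} for i in range(40)]
first_page = query_then_filter(posts, {"blocked"})
print(len(first_page))  # fewer than the 20 that were requested
```

Because the filter runs on a fixed slice, the same page comes back short by the same amount on every run, which would match the consistent drop-outs on the same pages.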
 
What other filtering would that be? The documentation doesn't say that 20 is the absolute limit btw. I just happened to set it to that explicitly even though it was the default.
 
If I remember correctly, the user blocks. And I didn't mean that 20 was the absolute, I just was speaking in your case.
 
What are "user blocks"?
 
Users blocking contacts.
 
Ah got it. That doesn't make sense in the context of querying against my own posts though. I'm running the query "api/statuses/user_timeline?user_id=<userID>&count=<count>&page=<page>"
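
For reference, a minimal sketch of paging through that query. The endpoint path and the `user_id`, `count`, and `page` parameters come from the query above; the base URL, the `get_json` callable, the `max_pages` cap, and the assumption that pages start at 1 are all placeholders of mine.

```python
from urllib.parse import urlencode


def timeline_url(base_url, user_id, count, page):
    """Build the user_timeline URL for one page of results."""
    query = urlencode({"user_id": user_id, "count": count, "page": page})
    return f"{base_url}/api/statuses/user_timeline?{query}"


def fetch_all_pages(base_url, user_id, get_json, count=20, max_pages=1000):
    """Collect posts page by page until an empty page comes back.

    `get_json` is a hypothetical callable that takes a URL and returns the
    decoded list of posts. Pages are assumed to start at 1.
    """
    posts = []
    for page in range(1, max_pages + 1):
        batch = get_json(timeline_url(base_url, user_id, count, page))
        if not batch:          # stop on an empty page, not on a short one
            break
        posts.extend(batch)
    return posts
```

The loop deliberately does not stop when a page has fewer than `count` entries, since short pages can show up in the middle of the timeline. An entirely filtered page would still end the loop early, so this is a sketch rather than a guarantee of completeness.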
 
Indeed, I'm not sure what other shady filtering we do between the query and the output but I'm not surprised.
 
I don't find that encouraging :). The good news is that from my limited sampling it appears that the drop outs were consistently on the same page so I'm not missing posts. The other weird thing that happened was comments on comments were being returned in the query for posts by the looks of it. Not sure why.
 
A lot of things in Friendica aren't encouraging, but I believe we are fixing them one by one. One day...
 
...and done now...
 
@Anadam :-D
Thanks for the heads-up. I must have set this when I first signed up, without thinking of the consequences. I'll change it now.
 
Hey @Garry Knight - heh, I had modified the settings yesterday to where they should be, and now you have no expiry settings enabled. The settings that were in place were the defaults from when you signed up, I believe, so it's not anything you did prior to today.
 
Just because I can see this post, which means my opinion got polled, even if only unintentionally, I'll throw in my 2 pennies.

Ironically, when G+ got nuked, my cynical sense of "digital permanence" went with it. Cynical, in that with so many services like Google digitally crawling the web and copying and archiving very-nearly everything, I naively thought they'd just leave it up, and the thousands of value-added comments attached to my blog would last a while. They didn't even last five years.

So, I have a bit of fatigue, I guess, and think a shelf-life for anything posted here makes sense, as it is part of the natural ecosystem.

I also think it's really cool that you asked people first.

HUGS