Improving and expanding hashtags usability

Current functionality of hashtags in Diaspora has several limitations. I’ll list them first (if you can come up with more - feel free to comment):

  1. User can only follow a straight list of tags (i.e. following is defined with logical OR).

  2. User can’t exclude tags from what they follow (i.e. hide content that carries them).

  3. Viewing options of followed tags in the UI are limited to either viewing all followed tags, or only one.

  4. Search is limited to one tag.

  5. Hashtags themselves don’t allow whitespace in them which is caused by the syntax restrictions and results in awkward unreadable notations like “#areallylonghashtagyoucanbreakyoureyeson”, or people trying to come up with workarounds (like using underscore or camel notation, which proliferates incompatible tags and defeats the purpose of searching by them).

To address these issues, several improvements can be made (these proposals match the problems described above).

1-2. Following by default can still be defined as a simple list (which is equal to tag1 OR tag2 OR tag3 OR … OR tagN). An interested user can be given an advanced option to define a more complex boolean expression for following hashtags, i.e. one that allows AND, OR, NOT and parentheses. This covers both 1 and 2 and allows a far more flexible way of following and filtering.

  3. When using the UI for viewing, one should be given a way to view one, several or all of the followed hashtags (multiple select). This is sufficient for the UI case. More complex views will be covered by search (4).

  4. Search should allow boolean expressions, the same way as following does in proposal (1-2).

  5. The syntax of hashtags can be expanded. For example, it can allow the following form in addition to simple tags without whitespace:

#(some phrase with spaces)

In the final text it can look like a hyperlink without parentheses:

#some phrase with spaces

Parentheses are used just for definition, to delimit the beginning and the end of the tag. This will give a clean way to avoid multiple incompatible notations for complex multiword hashtags.
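To make the #(…) idea a bit more concrete, here is a minimal sketch of how such tags could be picked out of a post. It is only an illustration under my own assumptions: the regular expression, the `extract_tags` and `render` names, and the square brackets standing in for a hyperlink are all hypothetical, not Diaspora's actual tag parser.

```python
import re

# Hypothetical pattern: "#(" ... ")" for a multiword tag, or "#" followed by
# word characters and hyphens for a classic tag.
TAG_RE = re.compile(r"#\(([^()\n]+)\)|#([\w-]+)")

def _clean(match):
    """Return the tag name without '#' or parentheses, whitespace collapsed."""
    multiword, simple = match.groups()
    raw = multiword if multiword is not None else simple
    return " ".join(raw.split())

def extract_tags(text):
    """List the tags in order of appearance, skipping empty ones like '#(  )'."""
    return [tag for tag in (_clean(m) for m in TAG_RE.finditer(text)) if tag]

def render(text):
    """Drop the delimiting parentheses; brackets stand in for a hyperlink."""
    return TAG_RE.sub(lambda m: "[#" + _clean(m) + "]", text)

post = "I like #(some phrase with spaces) and #linux"
print(extract_tags(post))
print(render(post))
```

Because the opening `#(` is distinct from a plain `#`, the pattern never has to look past the closing parenthesis to decide what kind of tag it is reading.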

Note: This discussion was imported from Loomio.

I believe there is a feature proposal out there that could solve many of these by allowing users to create custom streams based on some logic that they define. I’ve been away for a while though, so I’m not quite sure of its status.

As for #5, I don’t think allowing whitespace in hashtags is a good idea. The way I view hashtags, they should be as simple and to the point as possible. A quick way to figure out what ideas are trying to be expressed in the post, not an expression of an idea itself. That’s just my opinion though, and the community may see hashtags differently. :slight_smile:

@l3mncakes: About whitespace in hashtags. Users already often use multiword hashtags - you can’t force them to stop doing it, so we just need to accept it. The point of the proposal is to create a better syntax for using them, which will result in cleaner looking hashtags which would follow one convention, reducing incompatible duplicates.

@Shmerl - I don’t advocate forcing anybody to stop using multi-word hashtags if that’s what they want to do, but I don’t see how allowing this will do anything to reduce incompatible duplicates. If anything, I only see it perpetuating them by increasing the possibilities of variation that come out of using multi-word hashtags in the first place. The same way I can’t force people to not use multi-word tags, I also can’t force everybody to start using spaces in their tags. People will still use the no space or underscore or camelcase variations, so all I see this doing is further fracturing the conversation. Since one of the goals of our network is to facilitate topical conversation and interaction, I see this as being counter-intuitive to our goals, which is why I don’t think it’s something we should build into the system just because some people out there use hashtags that way. Of course, I’m far from the end-all decider here, so my single objection doesn’t mean this won’t happen if there is support for it. :wink:

You create a better option, and help people to understand better ways of using multiword hashtags through FAQs and tutorials. In the end people would use whatever they want. But giving an option of using cleaner syntax and better readability is a good thing.

As I said, we are dealing with an existing fact, i.e. people are already using multiword hashtags, but there is no clean syntax which would produce readable results. That’s the issue that can be addressed. Convergence of hashtags can also be achieved through loose search, i.e. the search can ignore differences in whitespace and underscores, which would allow joining the results.

I just wanted to say I think this is a crucial feature. Just one further thing to suggest – incorporating the possibility of specifying a post-author in these complex follow/search terms. One could thus follow or search for everything a particular person says about a particular subject.

(I mean using @-(name) in the complex search outlined by Shmerl above as a way of seeing posts /by/ that particular user which include the specified hashtags. I suppose there could also be a way of following/searching for posts which include the specified hashtags AND @-mentions of a particular user, i.e. posts about a topic and about that user.)

This would complement the “Aspects” feature. As I understand it, “Aspects” allow the poster to compartmentalise their followers (e.g. “I’m only going to send my political rants to my family”). My suggestion would effectively allow the follower to compartmentalise those they follow (“I only care about X’s posts about Linux, not her holiday photos”).

Some time ago I thought about presenting a proposal like this.

It is a small improvement to the Diaspora* search engine:

  1. It can search a combination of tags, eg + #nature #photo
  2. It can search users combined with tags, eg: + #palestine

I quite like the idea of using a “loose search”, because it would solve the problem of fragmented tagging conventions: if someone follows “#supertuxkart”, they would see #super-tux-kart, #super_tux_kart, #supertuxkart, #SuperTuxKart… It would make hashtags more usable, I think. Combined with Juan Santiago’s idea and the possibility of following hashtag combinations, it would make the whole diaspora search engine and tagging system way more powerful.
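A rough sketch of what such a loose comparison could look like, assuming tags are compared through a normalised key. The `loose_key` and `loose_match` names are made up for illustration, and the rule (drop the leading `#`, ignore case, strip whitespace, underscores and hyphens) is just one possible choice.

```python
import re

def loose_key(tag):
    """Reduce a tag to a comparison key: no '#', no case, no separators."""
    return re.sub(r"[\s_\-]+", "", tag.lstrip("#")).casefold()

def loose_match(followed, post_tags):
    """Return the tags from post_tags that loosely equal the followed tag."""
    key = loose_key(followed)
    return [tag for tag in post_tags if loose_key(tag) == key]

variants = ["#super-tux-kart", "#super_tux_kart", "#supertuxkart", "#SuperTuxKart", "#linux"]
print(loose_match("#supertuxkart", variants))
```

One trade-off to keep in mind: a key like this also merges tags that were meant to differ, for example `#now_here` and `#nowhere`, so it probably fits better as a search option than as a rule applied when tags are stored.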

Multiple-tags search is also useful (and I often find myself desiring it) to refine a search, like: #web #browsers #GPL, or whatever.
I suppose this would come in handy together with the ‘multiple-tags-selection’ to refine a certain stream.

I don’t know about the specific terms.

What if we switch from the [space] character to a comma to separate tags?

I think what follows from this would be a tag field which is separate from the main content of the post. If this is not implemented, it may result in a lot of clumsiness from users who #like to place #tags #in-line with their #posts instead of appending tags to the end.

Another method might be to separate tags with a double-[space].

In any case, I heartily approve of multiple-tag as well as author+tag searching.

@chris26 @riveravaldez @kazhnuz @shmerl

I think I have pulled this thread off its subject.

Is it appropriate to create a new thread to discuss combined search?

Or could someone who writes English better than I do create a vote for that here?

Proposal: multi-word hashtags should be enabled by requiring two hashes

Multi-word hashtags should be enabled by requiring one hash (#) preceding a non-[space] character and another hash following a non-[space] character.

If there is no hash symbol in the text which follows a non-space character, it should be assumed that the last character in the tag is the one before the nearest [space].

For example, #this is a multiple word tag#.

And #this sentence only contains one tag.

This would allow people to use single- and multiple-word tags in-line with the text of their post. Multi-word hashtags are not created accidentally, and no new fields have to be created specifically for hashtags.

Outcome: Not passed. Bad/confusing syntax.


  • Yes: 2
  • Abstain: 2
  • No: 16
  • Block: 0

Note: This proposal was imported from Loomio. Vote details, some comments and metadata were not imported.

@juansantiago : This proposal covers advanced search, so you can add your implementation details if you want.

I think they are related, since following tags and actual search all perform queries based on similar logic. So implementation will likely have a lot of shared parts.
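As a sketch of what that shared part might look like, here is a small evaluator for AND / OR / NOT expressions with parentheses. The same matcher could serve both a follow rule and a search query. Everything in it is an assumption made for illustration: the function names, the upper-case operator keywords, and the choice to compare tags case-insensitively.

```python
import re

# Operators are upper-case keywords; anything else is read as a tag.
TOKEN_RE = re.compile(r"\s*(\(|\)|AND\b|OR\b|NOT\b|#?[\w-]+)")

def tokenize(expr):
    expr, tokens, pos = expr.strip(), [], 0
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if not m:
            raise ValueError(f"unexpected input: {expr[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """or := and (OR and)* ; and := not (AND not)* ; not := NOT not | ( or ) | tag"""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def or_expr():
        node = and_expr()
        while peek() == "OR":
            take()
            node = ("or", node, and_expr())
        return node

    def and_expr():
        node = not_expr()
        while peek() == "AND":
            take()
            node = ("and", node, not_expr())
        return node

    def not_expr():
        tok = take()
        if tok == "NOT":
            return ("not", not_expr())
        if tok == "(":
            node = or_expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {tok!r}")
        return ("tag", tok.lstrip("#").casefold())

    tree = or_expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree

def matches(tree, tags):
    kind = tree[0]
    if kind == "tag":
        return tree[1] in tags
    if kind == "not":
        return not matches(tree[1], tags)
    left, right = matches(tree[1], tags), matches(tree[2], tags)
    return (left and right) if kind == "and" else (left or right)

def follows(expr, post_tags):
    """True if a post carrying post_tags satisfies the follow/search expression."""
    tags = {t.lstrip("#").casefold() for t in post_tags}
    return matches(parse(tokenize(expr)), tags)

print(follows("(#web OR #browsers) AND NOT #photo", ["#browsers", "#GPL"]))
```

A plain followed-tag list is then just the special case `#tag1 OR #tag2 OR …`, so the default behaviour and the advanced option can go through the same code path.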

@chris26 : I think such syntax is prone to more problems than the more clearly defined #(…), i.e. it is more ambiguous to parse. When regular tags are used, you still have to scan the text (either to the end, or to the next # that starts another tag) before figuring out whether the previous one was a regular tag or whether you still haven’t reached the end of a multiword tag. That is rather demanding and introduces complexity.

On the other hand, if you see #( you know right away that you are starting a multiword tag. It means less ambiguity and a simpler parser.

@shmerl : Your vote makes sense to me. This is also a concern of mine. But I wonder. Are we talking about the parser being outright prone to failure? Or are we talking about the amount of work required for a CPU?

If the latter, how much difference are we really talking about?

@chris26 : Any difference can reduce performance. In this case it also affects readability at editing time. When you edit the post, you can easily see the beginning ( and the ending ). But when you use exactly the same symbol for the beginning and the end of the tag, you’ll have a hard time visually catching them. I would have liked your proposal more if you had at least used different symbols for the beginning and the end (that wouldn’t solve the issue of parser complexity, but it would at least improve readability at editing time).

Alternatively you can simply use another symbol altogether to reduce ambiguity (i.e. less common one than round brackets which are often used in text).


{multi word tag}

Then there won’t be a need to put # in the beginning at all and it would be readable and easy to parse.

Curly brackets sound like a plan to me. Leave the functionality of #, and add {} tags.

I want to second Shmerl’s argument here. #bla bla# might cause unwanted ambiguities and makes the code hard to test. I’d also prefer something like #(…) #"…" #[…] or #{…}.