Hi Habr!
The other day I once again came across code like this:
    if (someParameter.Volatilities.IsEmpty())
    {
        // We have to report broken channels, but we cannot tell them apart
        // from a cold system that simply has not started yet.
        // So we write this case to the log, and in an emergency IT Ops
        // will be able to find the relevant line.
        Log.Info("Channel {0} is broken or was not started yet", someParameter.Key);
    }
This code has one rather important feature: the recipient would very much like to know what really happened. In one case the system has a problem, and in the other it is just warming up. However, the model does not tell us which it is (to the convenience of the sender, who is often the author of the model).
Moreover, even the suspicion that "maybe something is wrong" rests only on the fact that the Volatilities collection is empty, which in some cases may be perfectly correct.
I am sure that most experienced developers have seen code that relies on secret knowledge in the style of "if this combination of flags is set, then we are asked to do A, B and C" (although none of this is visible from the model itself).
From my point of view, this kind of saving on class structure has an extremely negative effect on the project in the long run: it turns into a set of hacks and crutches, and more or less convenient code gradually becomes legacy.
Important: the examples in this article are useful for projects that have several developers (rather than one) and that will be updated and extended for at least 5-10 years. None of this makes sense if the project has a single developer for five years, or if no changes are planned after the release. Likewise, if the project is needed for only a couple of months, there is no point in investing in a clear data model.
However, if you are building something long-lived, read on.
Often the same field contains an object that can have different semantic meanings (as in the example). To save on classes, the developer leaves only one type and supplies it with flags (or with comments in the style of "if there is nothing here, then nothing was calculated"). This approach can mask an error, which is bad for the project but convenient for the team that provides the service, because the bugs are not visible from the outside. A more correct option, which lets even the far end of the wire find out what is actually happening, is to use an interface plus visitors.
In this case, the example from the introduction turns into code of the form:
    class Response
    {
        public IVolatilityResponse Data { get; }
    }

    interface IVolatilityResponse
    {
        TOutput Visit<TInput, TOutput>(IVolatilityResponseVisitor<TInput, TOutput> visitor, TInput input);
    }

    class VolatilityValues : IVolatilityResponse
    {
        public Surface Data;

        // Interface implementations must be public in C#.
        public TOutput Visit<TInput, TOutput>(IVolatilityResponseVisitor<TInput, TOutput> visitor, TInput input)
            => visitor.Visit(this, input);
    }

    class CalculationIsBroken : IVolatilityResponse
    {
        public TOutput Visit<TInput, TOutput>(IVolatilityResponseVisitor<TInput, TOutput> visitor, TInput input)
            => visitor.Visit(this, input);
    }

    interface IVolatilityResponseVisitor<TInput, TOutput>
    {
        TOutput Visit(VolatilityValues instance, TInput input);
        TOutput Visit(CalculationIsBroken instance, TInput input);
    }
With this kind of processing, the points to watch concern Response, its serialization to json or protobuf, and IVolatilityResponseVisitor<TInput, TOutput>.
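To illustrate how a consumer might use the visitor from the example above, here is a minimal C# sketch. The Surface stand-in, the LogLineVisitor class and the VisitorDemo helper are hypothetical names introduced only for illustration; the rest repeats the interfaces from the article so that the block compiles on its own.

```csharp
using System;

// Hypothetical stand-in for the article's Surface type.
public sealed class Surface
{
    public override string ToString() => "surface";
}

public interface IVolatilityResponseVisitor<TInput, TOutput>
{
    TOutput Visit(VolatilityValues instance, TInput input);
    TOutput Visit(CalculationIsBroken instance, TInput input);
}

public interface IVolatilityResponse
{
    TOutput Visit<TInput, TOutput>(IVolatilityResponseVisitor<TInput, TOutput> visitor, TInput input);
}

public sealed class VolatilityValues : IVolatilityResponse
{
    public Surface Data = new Surface();

    public TOutput Visit<TInput, TOutput>(IVolatilityResponseVisitor<TInput, TOutput> visitor, TInput input)
        => visitor.Visit(this, input);
}

public sealed class CalculationIsBroken : IVolatilityResponse
{
    public TOutput Visit<TInput, TOutput>(IVolatilityResponseVisitor<TInput, TOutput> visitor, TInput input)
        => visitor.Visit(this, input);
}

// Hypothetical consumer: turns a response into a log line for a channel key.
// Both cases must be handled, otherwise the class does not compile.
public sealed class LogLineVisitor : IVolatilityResponseVisitor<string, string>
{
    public string Visit(VolatilityValues instance, string channelKey)
        => $"Channel {channelKey}: received {instance.Data}";

    public string Visit(CalculationIsBroken instance, string channelKey)
        => $"Channel {channelKey}: calculation is broken";
}

public static class VisitorDemo
{
    public static string Describe(IVolatilityResponse response, string channelKey)
        => response.Visit<string, string>(new LogLineVisitor(), channelKey);
}
```

With this shape the receiving side no longer guesses from an empty collection: a broken calculation and a set of values are different types, and each visitor has to say what it does with both.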
A number of other languages (for example, Scala or Kotlin) have keywords that restrict inheritance from a type under certain conditions. Thus, at compile time we know all the possible subtypes of our type.
In particular, the example above can be rewritten in Kotlin like this:
    class Response(val data: VolatilityResponse)

    sealed class VolatilityResponse

    class VolatilityValues(val data: Surface) : VolatilityResponse()

    class CalculationIsBroken : VolatilityResponse()
The code came out a little shorter, and the compiler now knows that all possible subtypes of VolatilityResponse are in the same file with it. This means that the following code will not compile, since it does not cover all the possible values of the class.
    fun getResponseString(response: VolatilityResponse) = when (response) {
        is VolatilityValues -> response.data.toString()
    }
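However, it is worth remembering that such checks apply only when the when construct is used as an expression, that is, when its result is used. As a statement, the code below compiles without errors in Kotlin versions before 1.7 (newer compilers reject it as well):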
    fun getResponseString(response: VolatilityResponse) {
        when (response) {
            is VolatilityValues -> println(response.data.toString())
        }
    }
Consider a relatively typical application backed by a database. Most likely, somewhere in the code you will have object identifiers. For example:
    class Group
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    class User
    {
        public int Id { get; set; }
        public int GroupId { get; set; }
        public string Name { get; set; }
    }
This looks like standard code. The types even match those in the database. However, the question is: is the code below correct?
    public bool IsInGroup(User user, Group group)
    {
        return user.Id == group.Id;
    }

    public User CreateUser(string name, Group group)
    {
        return new User { Id = group.Id, GroupId = group.Id, Name = name };
    }
The answer is most likely no: the first method compares the user Id with the group Id, and the second mistakenly uses the Id from Group as the Id of User.
Oddly enough, this is quite simple to fix: just introduce the types GroupId, UserId and so on. Creating a User this way will then no longer compile, since the types do not match. Which is incredibly cool, because you have managed to tell the compiler about the model.
Moreover, methods that take several parameters of the same underlying type now work correctly, since the arguments can no longer be swapped by accident:
public void SetUserGroup(UserId userId, GroupId groupId) { /* some sql code */ }
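A minimal sketch of what such identifier types could look like in C#. The class shapes, the Value property and the Users helper are assumptions made for illustration; the article only names the types GroupId and UserId.

```csharp
using System;

// Hypothetical wrappers: each identifier gets its own type,
// although both wrap a plain int.
public sealed class UserId
{
    public UserId(int value) { Value = value; }
    public int Value { get; }
}

public sealed class GroupId
{
    public GroupId(int value) { Value = value; }
    public int Value { get; }
}

public sealed class User
{
    public User(UserId id, GroupId groupId, string name)
    {
        Id = id;
        GroupId = groupId;
        Name = name;
    }

    public UserId Id { get; }
    public GroupId GroupId { get; }
    public string Name { get; }
}

public static class Users
{
    public static User Create(UserId id, string name, GroupId groupId)
        => new User(id, groupId, name);

    // Compares a group with a group; user.Id cannot be passed here by mistake.
    public static bool IsInGroup(User user, GroupId groupId)
        => user.GroupId.Value == groupId.Value;

    // new User(groupId, groupId, name) is rejected by the compiler:
    // a GroupId cannot be converted to a UserId.
}
```

The mistake from the earlier CreateUser example stops being a runtime surprise and becomes a compile error.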
However, let us return to the comparison of identifiers. It is a little more complicated, since the compiler must be stopped from comparing the incomparable at build time.
You can do this as follows:
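    class GroupId
    {
        public int Id { get; }

        public bool Equals(GroupId groupId) => Id == groupId?.Id;

        [Obsolete("GroupId can be equal only with GroupId", error: true)]
        public override bool Equals(object obj) => Equals(obj as GroupId);

        public static bool operator ==(GroupId id1, GroupId id2)
        {
            if (ReferenceEquals(id1, id2)) return true;
            if (ReferenceEquals(id1, null) || ReferenceEquals(id2, null)) return false;
            return id1.Id == id2.Id;
        }

        // C# requires a matching != for every == overload.
        public static bool operator !=(GroupId id1, GroupId id2) => !(id1 == id2);

        [Obsolete("GroupId can be equal only with GroupId", error: true)]
        public static bool operator ==(object _, GroupId __) => throw new NotSupportedException("GroupId can be equal only with GroupId");

        [Obsolete("GroupId can be equal only with GroupId", error: true)]
        public static bool operator !=(object _, GroupId __) => throw new NotSupportedException("GroupId can be equal only with GroupId");

        [Obsolete("GroupId can be equal only with GroupId", error: true)]
        public static bool operator ==(GroupId _, object __) => throw new NotSupportedException("GroupId can be equal only with GroupId");

        [Obsolete("GroupId can be equal only with GroupId", error: true)]
        public static bool operator !=(GroupId _, object __) => throw new NotSupportedException("GroupId can be equal only with GroupId");
    }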
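As a result, a GroupId can be compared only with another GroupId, and comparing it with anything else fails at compile time. A complete implementation would also cover IEquatable and GetHashCode.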
Often our applications impose additional rules on values, and these rules are easy to verify. In the worst case, a number of functions look something like this:
    void SetName(string name)
    {
        if (string.IsNullOrEmpty(name)
            || !char.IsLetter(name[0])
            || !char.IsUpper(name[0])
            || name.Length > MAX_NAME_COLUMN_LENGTH)
        {
            throw ....
        }
        /**/
    }
That is, the function accepts a fairly wide input type and then runs the checks. This is generally the wrong approach, since the signature accepts any string, while only some strings are valid values of name.
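The correct behavior: introduce a dedicated type Name, validate the string once when the Name is constructed, and have functions accept Name.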
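As a result, we get the checks for name in one place, and signatures such as void UpdateData(Name name, Email email, PhoneNumber number) that describe themselves instead of taking several values of type string.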
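A minimal sketch of such a validated type, reusing the rules from the SetName example above. The MaxLength value, the exception messages and the NameDemo helper are assumptions; the article does not give the value of MAX_NAME_COLUMN_LENGTH or say which exception is thrown.

```csharp
using System;

public sealed class Name
{
    // Assumed limit; the real MAX_NAME_COLUMN_LENGTH is not given in the article.
    public const int MaxLength = 64;

    public Name(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("Name must not be empty", nameof(value));
        if (!char.IsLetter(value[0]) || !char.IsUpper(value[0]))
            throw new ArgumentException("Name must start with a capital letter", nameof(value));
        if (value.Length > MaxLength)
            throw new ArgumentException("Name is too long", nameof(value));
        Value = value;
    }

    public string Value { get; }

    public override string ToString() => Value;
}

public static class NameDemo
{
    // Returns true when the raw string is accepted as a Name.
    public static bool IsValid(string raw)
    {
        try
        {
            new Name(raw);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

Any function that receives a Name can rely on the checks having already run, so they are not repeated at every call site.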
Having introduced fairly strict typing, we should not forget that when passing data to SQL we still need to get at the real identifier. In this case it is logical to slightly update the types that wrap a single string:
    interface IValueGet<TValue>
    {
        TValue Wrapped { get; }
    }

    abstract class BaseWrapper<TValue> : IValueGet<TValue>
    {
        protected BaseWrapper(TValue initialValue)
        {
            Wrapped = initialValue;
        }

        public TValue Wrapped { get; private set; }
    }

    sealed class Name : BaseWrapper<string>
    {
        public Name(string value) : base(value) { /*no necessary validations*/ }
    }

    sealed class UserId : BaseWrapper<int>
    {
        public UserId(int id) : base(id) { /*no necessary validations*/ }
    }
When it comes to creating a large number of types, you will often hear two opposing arguments:
Strictly speaking, both arguments are often given without facts; however, a few points about wrapping int and string are worth keeping in mind.
Important: you should turn to such optimizations only once you have measured evidence that it is the microtypes that slow down the application. In my experience, such a situation is highly unlikely. It is far more probable that the logger is what slows you down, because each operation waits for a flush to disk (everything was acceptable on the developer's computer with an M.2 SSD, but a user with an old HDD sees completely different results).
However, the tricks themselves revolve around the following: struct; Dictionary and Map; inline; ToString, GetHashCode and Equals; ToUpperInvariant.
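As one possible reading of how struct, Equals, GetHashCode and Dictionary fit together, here is a hedged C# sketch of an identifier implemented as a readonly struct. The type shape and the StructIdDemo helper are assumptions, not the author's code. Because the struct implements IEquatable<UserId>, the default comparer used by Dictionary calls the typed Equals and avoids boxing the key.

```csharp
using System;
using System.Collections.Generic;

// A value-type microtype: no separate heap allocation per identifier.
public readonly struct UserId : IEquatable<UserId>
{
    public UserId(int value) { Value = value; }

    public int Value { get; }

    // Typed equality, picked up by EqualityComparer<UserId>.Default.
    public bool Equals(UserId other) => Value == other.Value;

    public override bool Equals(object obj) => obj is UserId other && Equals(other);

    // The wrapped int already is a good hash.
    public override int GetHashCode() => Value;

    public override string ToString() => Value.ToString();
}

public static class StructIdDemo
{
    // Hypothetical lookup keyed by the typed identifier.
    public static string Lookup(int rawId)
    {
        var names = new Dictionary<UserId, string>
        {
            [new UserId(1)] = "Alice",
            [new UserId(2)] = "Bob",
        };
        return names.TryGetValue(new UserId(rawId), out var name) ? name : "unknown";
    }
}
```

Whether this is measurably faster than a class-based wrapper depends on the workload, so it should be confirmed by measurement before it is adopted.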
Once again I will repeat the important point from the beginning of the article: everything described here makes sense in large projects that are developed and used for years, where it pays to reduce the cost of support and of adding new functionality. In other cases, it is often most reasonable to make a product as quickly as possible without bothering with tests, models and “good code”.
However, for long-term projects, it is reasonable to use the most strict typing, where in the model we can strictly describe what values are possible in principle.
If your service can sometimes return a non-working result, then express it in the model and show it to the developer explicitly. Do not add a thousand flags with descriptions in the documentation.
If your types can look the same in the program but mean different things to the business, then define them as different types. Do not mix them, even if the types of their fields are the same.
If you have questions about performance, apply the scientific method and run a benchmark (or better, ask an independent person to check all this). That way you will actually speed up the program, and not just waste the team's time. The opposite is also true: if there is a suspicion that your program or library is slow, then measure it. There is no need to say that everything is fine; just show it in numbers.
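A crude sketch of such a measurement in C#, assuming a hypothetical Price wrapper and a simple summing workload. It uses only System.Diagnostics.Stopwatch; a dedicated benchmarking library would give more trustworthy numbers, so treat this as a starting point rather than a verdict.

```csharp
using System;
using System.Diagnostics;

// Hypothetical microtype used only for the comparison.
public readonly struct Price
{
    public Price(int cents) { Cents = cents; }
    public int Cents { get; }
}

public static class MicrotypeBenchmark
{
    public static long SumRaw(int[] values)
    {
        long sum = 0;
        foreach (var v in values) sum += v;
        return sum;
    }

    public static long SumWrapped(Price[] values)
    {
        long sum = 0;
        foreach (var v in values) sum += v.Cents;
        return sum;
    }

    // Runs the action several times and returns the best elapsed time in ticks.
    public static long BestOf(int runs, Action action)
    {
        long best = long.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            var watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            best = Math.Min(best, watch.ElapsedTicks);
        }
        return best;
    }
}
```

Calling BestOf once with SumRaw and once with SumWrapped on arrays of the same size gives two numbers to compare, which is already more than an argument made without facts.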